grn inference algorithms: Topics by WorldWideScience.org

Sample records for grn inference algorithms

GRN2SBML: automated encoding and annotation of inferred gene regulatory networks complying with SBML.

Science.gov (United States)

Vlaic, Sebastian; Hoffmann, Bianca; Kupfer, Peter; Weber, Michael; Dräger, Andreas

2013-09-01

GRN2SBML automatically encodes gene regulatory networks derived from several inference tools in the Systems Biology Markup Language (SBML). Through a graphical user interface, the networks can be annotated via the simple object access protocol (SOAP)-based application programming interface of the BioMart Central Portal and the Minimum Information Required in the Annotation of Models (MIRIAM) registry. Additionally, we provide an R package, which processes the output of supported inference algorithms and automatically passes all required parameters to GRN2SBML. Therefore, GRN2SBML closes a gap in the processing pipeline between the inference of gene regulatory networks and their subsequent analysis, visualization and storage. GRN2SBML is freely available under the GNU Public License version 3 and can be downloaded from http://www.hki-jena.de/index.php/0/2/490. General information on GRN2SBML, examples and tutorials are available at the tool's web page.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Xiangyun Xiao

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Science.gov (United States)

Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

2015-01-01

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

Science.gov (United States)

Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

2017-08-01

Correct inference of genetic regulations inside a cell from biological data such as time series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simplest approaches to modelling the dynamics as well as inferring correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small- and medium-scale artificial genetic networks, with and without different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.
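The abstract above does not spell out the RNN model it optimizes. As a rough illustration only, the sketch below assumes the discrete-time formulation commonly used in this literature, x_i(t+dt) = (dt/tau_i) * sigmoid(sum_j w_ij * x_j(t) + b_i) + (1 - dt/tau_i) * x_i(t). The function names, the parameter values and the mean-squared-error fitness are hypothetical choices, not taken from the paper; a metaheuristic such as ESWSA would search for the weights, biases and time constants that minimize such a fitness.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rnn_step(x, weights, bias, tau, dt=1.0):
    """Advance all gene expression levels by one time step (assumed model form)."""
    n = len(x)
    nxt = []
    for i in range(n):
        # weights[i][j] is the regulatory influence of gene j on gene i
        drive = sum(weights[i][j] * x[j] for j in range(n)) + bias[i]
        rate = dt / tau[i]
        nxt.append(rate * sigmoid(drive) + (1.0 - rate) * x[i])
    return nxt

def simulate(x0, weights, bias, tau, steps, dt=1.0):
    """Return the simulated time series, starting from the initial state x0."""
    series = [list(x0)]
    for _ in range(steps):
        series.append(rnn_step(series[-1], weights, bias, tau, dt))
    return series

def fitness(observed, weights, bias, tau, dt=1.0):
    """Mean squared error between an observed series and the model's simulation."""
    predicted = simulate(observed[0], weights, bias, tau, len(observed) - 1, dt)
    errors = [(p - o) ** 2
              for pred_row, obs_row in zip(predicted, observed)
              for p, o in zip(pred_row, obs_row)]
    return sum(errors) / len(errors)

# Hypothetical two-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
weights = [[0.0, -2.0], [3.0, 0.0]]
bias = [0.5, -1.0]
tau = [2.0, 2.0]
observed = simulate([0.2, 0.9], weights, bias, tau, steps=10)

perfect = fitness(observed, weights, bias, tau)
no_regulation = fitness(observed, [[0.0, 0.0], [0.0, 0.0]], bias, tau)
print(perfect, no_regulation)
```

In this toy setting the true parameters reproduce the series exactly, while a candidate with all weights set to zero yields a positive error; an optimizer would compare candidates by this value.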
Inferring regulatory networks from expression data using tree-based methods.

Directory of Open Access Journals (Sweden)

Vân Anh Huynh-Thu

2010-09-01

One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
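The decomposition described above (one regression problem per target gene, with input-gene importances pooled into a single ranking) can be illustrated with a deliberately reduced sketch. This is not GENIE3 itself: instead of Random Forests or Extra-Trees it uses a bagged ensemble of depth-1 regression trees (stumps), and importance is the fraction of target variance removed by the winning split. The function names `stump_gain` and `rank_links`, the toy data and all parameter values are assumptions made for illustration.

```python
import random

def variance(values):
    """Population variance of a list (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def stump_gain(x, y):
    """Best fraction of the variance of y removed by one threshold split on x."""
    n = len(y)
    total = variance(y) * n
    if total == 0.0:
        return 0.0
    pairs = sorted(zip(x, y))
    best = 0.0
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold separates equal x values
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        remaining = variance(left) * len(left) + variance(right) * len(right)
        best = max(best, (total - remaining) / total)
    return best

def rank_links(expr, n_trees=200, seed=0):
    """Rank directed links (score, regulator, target) from an expression table.

    expr maps each gene name to its expression values across samples.
    """
    rng = random.Random(seed)
    genes = sorted(expr)
    n_samples = len(expr[genes[0]])
    ranking = []
    for target in genes:  # one regression problem per target gene
        inputs = [g for g in genes if g != target]
        importance = dict.fromkeys(inputs, 0.0)
        for _ in range(n_trees):
            # bootstrap the samples, then grow a single stump on them
            rows = [rng.randrange(n_samples) for _ in range(n_samples)]
            y = [expr[target][r] for r in rows]
            gains = {g: stump_gain([expr[g][r] for r in rows], y) for g in inputs}
            winner = max(gains, key=gains.get)
            importance[winner] += gains[winner]
        ranking += [(importance[g] / n_trees, g, target) for g in inputs]
    return sorted(ranking, reverse=True)  # aggregate over all targets

# Toy data: C follows A exactly, B is unrelated to both.
expr = {
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 1, 0, 1, 0, 1, 0, 1],
    "C": [0, 0, 0, 0, 5, 5, 5, 5],
}
ranking = rank_links(expr)
for score, regulator, target in ranking:
    print(f"{regulator} -> {target}: {score:.3f}")
```

On this toy table the two links between A and C rank first in both directions, since expression data of this kind cannot tell which of two perfectly co-varying genes is the regulator. A fuller version would grow deep randomized trees and could be built on an existing random forest library.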
White matter hyperintensities are seen only in GRN mutation carriers in the GENFI cohort

Directory of Open Access Journals (Sweden)

Carole H. Sudre

2017-01-01

Genetic frontotemporal dementia is most commonly caused by mutations in the progranulin (GRN), microtubule-associated protein tau (MAPT) and chromosome 9 open reading frame 72 (C9orf72) genes. Previous small studies have reported the presence of cerebral white matter hyperintensities (WMH) in genetic FTD but this has not been systematically studied across the different mutations. In this study WMH were assessed in 180 participants from the Genetic FTD Initiative (GENFI) with 3D T1- and T2-weighted magnetic resonance images: 43 symptomatic (7 GRN, 13 MAPT and 23 C9orf72), 61 presymptomatic mutation carriers (25 GRN, 8 MAPT and 28 C9orf72) and 76 mutation negative non-carrier family members. An automatic detection and quantification algorithm was developed for determining load, location and appearance of WMH. Significant differences were seen only in the symptomatic GRN group compared with the other groups with no differences in the MAPT or C9orf72 groups: increased global load of WMH was seen, with WMH located in the frontal and occipital lobes more so than the parietal lobes, and nearer to the ventricles rather than juxtacortical. Although no differences were seen in the presymptomatic group as a whole, in the GRN cohort only there was an association of increased WMH volume with expected years from symptom onset. The appearance of the WMH was also different in the GRN group compared with the other groups, with the lesions in the GRN group being more similar to each other. The presence of WMH in those with progranulin deficiency may be related to the known role of progranulin in neuroinflammation, although other roles are also proposed including an effect on blood-brain barrier permeability and the cerebral vasculature. Future studies will be useful to investigate the longitudinal evolution of WMH and their potential use as a biomarker as well as post-mortem studies investigating the histopathological nature of the lesions.
Cerebrospinal Fluid Progranulin, but Not Serum Progranulin, Is Reduced in GRN-Negative Frontotemporal Dementia.

Science.gov (United States)

Wilke, Carlo; Gillardon, Frank; Deuschle, Christian; Hobert, Markus A; Jansen, Iris E; Metzger, Florian G; Heutink, Peter; Gasser, Thomas; Maetzler, Walter; Blauwendraat, Cornelis; Synofzik, Matthis

2017-01-01

Reduced progranulin levels are a hallmark of frontotemporal dementia (FTD) caused by loss-of-function (LoF) mutations in the progranulin gene (GRN). However, alterations of central nervous progranulin expression also occur in neurodegenerative disorders unrelated to GRN mutations, such as Alzheimer's disease. We hypothesised that central nervous progranulin levels are also reduced in GRN-negative FTD. Progranulin levels were determined in both cerebrospinal fluid (CSF) and serum in 75 subjects (37 FTD patients and 38 controls). All FTD patients were assessed by whole-exome sequencing for GRN mutations, yielding a target cohort of 34 patients without pathogenic mutations in GRN (GRN-negative cohort) and 3 GRN mutation carriers (2 LoF variants and 1 novel missense variant). Not only the GRN mutation carriers but also the GRN-negative patients showed decreased CSF levels of progranulin (serum levels in GRN-negative patients were normal). The decreased CSF progranulin levels were unrelated to patients' increased CSF levels of total tau, possibly indicating different destructive neuronal processes within FTD neurodegeneration. The patient with the novel GRN missense variant (c.1117C>T, p.P373S) showed substantially decreased CSF levels of progranulin, comparable to the 2 patients with GRN LoF mutations, suggesting a pathogenic effect of this missense variant. Our results indicate that central nervous progranulin reduction is not restricted to the relatively rare cases of FTD caused by GRN LoF mutations, but also contributes to the more common GRN-negative forms of FTD. Central nervous progranulin reduction might reflect a partially distinct pathogenic mechanism underlying FTD neurodegeneration and is not directly linked to tau alterations. © 2016 S. Karger AG, Basel.
Assessment of network inference methods: how to cope with an underdetermined problem.

Directory of Open Access Journals (Sweden)

Caroline Siegenthaler

The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
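The idea of a confusion matrix that ignores non-inferable regulations can be sketched as follows. This is a simplified reading of the abstract, not the authors' procedure: the set of inferable edges is taken as a given input (the paper derives it by causal inference, which is not reproduced here), and edges outside that set are simply skipped when counting. The function name, the three-gene network and the choice of inferable edges are hypothetical.

```python
def confusion_matrix(predicted, gold, all_edges, inferable=None):
    """Count TP, FP, FN and TN over candidate directed edges.

    predicted, gold: sets of (regulator, target) pairs.
    all_edges: every candidate directed edge.
    inferable: optional set of edges deemed inferable; when given,
        edges outside it are left out of every count.
    """
    tp = fp = fn = tn = 0
    for edge in all_edges:
        if inferable is not None and edge not in inferable:
            continue  # non-inferable edge: contributes no error and no credit
        in_pred, in_gold = edge in predicted, edge in gold
        if in_pred and in_gold:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_gold:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

genes = ["A", "B", "C"]
all_edges = [(r, t) for r in genes for t in genes if r != t]
gold = {("A", "B"), ("B", "C")}
predicted = {("A", "B"), ("A", "C")}
# Assumed for illustration: these two edges cannot be resolved from the data.
inferable = set(all_edges) - {("A", "C"), ("B", "C")}

standard = confusion_matrix(predicted, gold, all_edges)
adjusted = confusion_matrix(predicted, gold, all_edges, inferable)
print(standard)  # {'TP': 1, 'FP': 1, 'FN': 1, 'TN': 3}
print(adjusted)  # {'TP': 1, 'FP': 0, 'FN': 0, 'TN': 3}
```

In the toy case the method is no longer penalized for the two edges it had no means of resolving, which is how a ranking of methods can shift once inferability is taken into account.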
Reconstructing Genetic Regulatory Networks Using Two-Step Algorithms with the Differential Equation Models of Neural Networks.

Science.gov (United States)

Chen, Chi-Kan

2017-07-26

The identification of genetic regulatory networks (GRNs) provides insights into complex cellular processes. A class of recurrent neural networks (RNNs) captures the dynamics of GRN. Algorithms combining the RNN and machine learning schemes were proposed to reconstruct small-scale GRNs using gene expression time series. We present new GRN reconstruction methods with neural networks. The RNN is extended to a class of recurrent multilayer perceptrons (RMLPs) with latent nodes. Our methods contain two steps: the edge rank assignment step and the network construction step. The former assigns ranks to all possible edges by a recursive procedure based on the estimated weights of wires of RNN/RMLP (RE_RNN/RE_RMLP), and the latter constructs a network consisting of top-ranked edges under which the optimized RNN simulates the gene expression time series. The particle swarm optimization (PSO) is applied to optimize the parameters of RNNs and RMLPs in a two-step algorithm. The proposed RE_RNN-RNN and RE_RMLP-RNN algorithms are tested on synthetic and experimental gene expression time series of small GRNs of about 10 genes. The experimental time series are from the studies of yeast cell cycle regulated genes and E. coli DNA repair genes. The unstable estimation of RNN using experimental time series having limited data points can lead to fairly arbitrary predicted GRNs. Our methods incorporate RNN and RMLP into a two-step structure learning procedure. Results show that the RE_RMLP using the RMLP with a suitable number of latent nodes to reduce the parameter dimension often results in more accurate edge ranks than the RE_RNN using the regularized RNN on short simulated time series. Combining by a weighted majority voting rule the networks derived by the RE_RMLP-RNN using different numbers of latent nodes in step one to infer the GRN, the method performs consistently and outperforms published algorithms for GRN reconstruction on most benchmark time series. The framework of two
A Learning Algorithm for Multimodal Grammar Inference.

Science.gov (United States)

D'Ulizia, A; Ferri, F; Grifoni, P

2011-12-01

The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

Energy Technology Data Exchange (ETDEWEB)

Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)]

2014-05-20

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

International Nuclear Information System (INIS)

Santra, Tapesh

2014-01-01

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
Congested Link Inference Algorithms in Dynamic Routing IP Network

Directory of Open Access Journals (Sweden)

Yu Chen

2017-01-01

The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades markedly in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of the dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP) and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Because multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validated our results by experiments of analogy, simulation, and the actual Internet.
A canonical correlation analysis-based dynamic bayesian network prior to infer gene regulatory networks from multiple types of biological data.

Science.gov (United States)

Baur, Brittany; Bozdag, Serdar

2015-04-01

One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions with high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in a dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
Efficient algorithms for conditional independence inference

Czech Academy of Sciences Publication Activity Database

Bouckaert, R.; Hemmecke, R.; Lindner, S.; Studený, Milan

2010-01-01

Vol. 11, No. 1 (2010), pp. 3453-3479 ISSN 1532-4435 R&D Projects: GA ČR GA201/08/0539; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords: conditional independence inference * linear programming approach Subject RIV: BA - General Mathematics Impact factor: 2.949, year: 2010 http://library.utia.cas.cz/separaty/2010/MTR/studeny-efficient algorithms for conditional independence inference.pdf
The unexpected co-occurrence of GRN and MAPT p.A152T in Basque families: Clinical and pathological characteristics.

Directory of Open Access Journals (Sweden)

Fermin Moreno

The co-occurrence of the c.709-1G>A GRN mutation and the p.A152T MAPT variant has been identified in 18 Basque families affected by frontotemporal dementia (FTD). We aimed to investigate the influence of the p.A152T MAPT variant on the clinical and neuropathological features of these Basque GRN families. We compared clinical characteristics of 14 patients who carried the c.709-1G>A GRN mutation (GRN+/A152T-) with 21 patients who carried both the c.709-1G>A GRN mutation and the p.A152T MAPT variant (GRN+/A152T+). Neuropsychological data (n = 17) and plasma progranulin levels (n = 23) were compared between groups, and 7 subjects underwent neuropathological studies. We genotyped six short tandem repeat markers in the two largest families. By the analysis of linkage disequilibrium decay in the haplotype block we estimated the time when the first ancestor to carry both genetic variants emerged. GRN+/A152T+ and GRN+/A152T- patients shared similar clinical and neuropsychological features and plasma progranulin levels. All were diagnosed with an FTD disorder, including behavioral variant FTD or non-fluent/agrammatic variant primary progressive aphasia, and shared a similar pattern of neuropsychological deficits, predominantly in executive function, memory, and language. All seven participants with available brain autopsies (6 GRN+/A152T+, 1 GRN+/A152T-) showed frontotemporal lobar degeneration with TDP-43 inclusions (type A classification), which is characteristic of GRN carriers. Additionally, all seven showed mild to moderate tau inclusion burden: five cases lacked β-amyloid pathology and two cases had Alzheimer's pathology. The co-occurrence of both genes within one individual is recent, with the birth of the first GRN+/A152T+ individual estimated to be within the last 50 generations (95% probability). In our sample, the p.A152T MAPT variant does not appear to show a discernible influence on the clinical phenotype of GRN carriers. Whether p.A152T confers a
Image-Data Compression Using Edge-Optimizing Algorithm for WFA Inference.

Science.gov (United States)

Culik, Karel II; Kari, Jarkko

1994-01-01

Presents an inference algorithm that produces a weighted finite automaton (WFA), in particular, the grayness functions of graytone images. Image-data compression based on the new inference algorithm produces a WFA with a relatively small number of edges. Image-data compression results alone and in combination with wavelets are discussed.…
Inference algorithms and learning theory for Bayesian sparse factor analysis

International Nuclear Information System (INIS)

Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

2009-01-01

Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Inference algorithms and learning theory for Bayesian sparse factor analysis

Energy Technology Data Exchange (ETDEWEB)

Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

2009-12-01

Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.

Science.gov (United States)

Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming

2014-11-30

Prediction of gene regulatory networks (GRNs) from expression data is a challenging task. Many methods have been developed to address this challenge, ranging from supervised to unsupervised methods. The most promising methods are based on the support vector machine (SVM). There is a need for comprehensive analysis of the prediction accuracy of the supervised SVM method using different kernels on different biological experimental conditions and network sizes. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRN. Using CompareSVM, we investigated and evaluated different SVM kernel methods on simulated microarray datasets of different sizes in detail. The results obtained from CompareSVM showed that the accuracy of an inference method depends upon the nature of the experimental condition and the size of the network. For network with nodes (SVM Gaussian kernel outperform on knockout, knockdown, and multifactorial datasets compared to all the other inference methods. For networks with a large number of nodes (~500), the choice of inference method depends upon the nature of the experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .

Neural model of gene regulatory network: a survey on supportive meta-heuristics.

Science.gov (United States)

Biswas, Surama; Acharyya, Sriyankar

2016-06-01

A gene regulatory network (GRN) is produced as a result of regulatory interactions between different genes through their coded proteins in a cellular context. Having immense importance in disease detection and drug finding, GRNs have been modelled through various mathematical and computational schemes and reported in survey articles. Neural and neuro-fuzzy models have been the focus of attraction in bioinformatics. The predominant use of meta-heuristic algorithms in training neural models has proved its excellence. Considering these facts, this paper is organized to survey neural modelling schemes of GRNs and the efficacy of meta-heuristic algorithms towards parameter learning (i.e. weighting connections) within the model. This survey presents two structure-related approaches to inferring GRNs: the global structure approach and the substructure approach. It also describes two neural modelling schemes, namely artificial neural network/recurrent neural network based modelling and neuro-fuzzy modelling. The meta-heuristic algorithms applied so far to learn the structure and parameters of neurally modelled GRNs are reviewed here.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

Directory of Open Access Journals (Sweden)

Joeri Ruyssinck

Full Text Available One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
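The rankwise averaging mentioned in the NIMEFI abstract above can be illustrated with a minimal sketch. This is not the NIMEFI implementation: the function names, the tie-breaking rule, and the example scores are all hypothetical, and every method is assumed to score the same set of candidate edges.

```python
from collections import defaultdict

def rank_edges(scores):
    """Map each edge to its rank (1 = highest importance score)."""
    # Ties are broken by edge name; this rule is an assumption of the sketch.
    ordered = sorted(scores, key=lambda e: (-scores[e], e))
    return {edge: pos for pos, edge in enumerate(ordered, start=1)}

def rankwise_average(score_tables):
    """Average the per-method rank of every edge; a lower value is better."""
    totals = defaultdict(float)
    for scores in score_tables:
        for edge, rank in rank_edges(scores).items():
            totals[edge] += rank
    n = len(score_tables)
    return {edge: total / n for edge, total in totals.items()}

# Hypothetical importance scores (regulator, target) from two ensemble methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}

combined = rankwise_average([method_a, method_b])
ranking = sorted(combined, key=lambda e: (combined[e], e))
```

Averaging ranks rather than raw scores avoids having to put importance values from different regression methods on a common scale, which is one plausible reason for combining methods this way.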
Missense mutation in GRN gene affecting RNA splicing and plasma progranulin level in a family affected by frontotemporal lobar degeneration.

Science.gov (United States)

Luzzi, Simona; Colleoni, Lara; Corbetta, Paola; Baldinelli, Sara; Fiori, Chiara; Girelli, Francesca; Silvestrini, Mauro; Caroppo, Paola; Giaccone, Giorgio; Tagliavini, Fabrizio; Rossi, Giacomina

2017-06-01

GRN, the gene coding for progranulin, is a major gene linked to frontotemporal lobar degeneration. While most pathogenic GRN mutations are null mutations leading to haploinsufficiency, GRN missense mutations do not have an obvious pathogenicity, and only a few have been revealed to act through different pathogenetic mechanisms, such as cytoplasmic missorting, protein degradation, and abnormal cleavage by elastase. The aim of this study was to disclose the pathogenetic mechanisms of the GRN A199V missense mutation, which was previously reported not to alter physiological progranulin features but was associated with a reduced plasma progranulin level. After investigating the family pedigree, we performed genetic and biochemical analysis on its members and performed RNA expression studies. We found that the mutation segregates with the disease and discovered that its pathogenic feature is the alteration of GRN mRNA splicing, actually leading to haploinsufficiency. Thus, when faced with a missense GRN mutation, its pathogenetic effects should be investigated, especially if it is associated with low plasma progranulin levels, to determine whether it is a benign polymorphism or a pathogenic mutation. Copyright © 2017 Elsevier Inc. All rights reserved.
Algorithms for MDC-Based Multi-locus Phylogeny Inference

Science.gov (United States)

Yu, Yun; Warnow, Tandy; Nakhleh, Luay

One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is to minimize deep coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily be rooted. In this paper, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. Finally, we study the performance of these methods in coalescent-based computer simulations.
A novel frameshift GRN mutation results in frontotemporal lobar degeneration with a distinct clinical phenotype in two siblings: case report and literature review.

Science.gov (United States)

Hosaka, Takashi; Ishii, Kazuhiro; Miura, Takeshi; Mezaki, Naomi; Kasuga, Kensaku; Ikeuchi, Takeshi; Tamaoka, Akira

2017-09-15

Progranulin gene (GRN) mutations are major causes of frontotemporal lobar degeneration. To date, 68 pathogenic GRN mutations have been identified. However, very few of these mutations have been reported in Asians. Moreover, some GRN mutations manifest with familial phenotypic heterogeneity. Here, we present a novel GRN mutation resulting in frontotemporal lobar degeneration with a distinct clinical phenotype, and we review reports of GRN mutations associated with familial phenotypic heterogeneity. We describe the case of a 74-year-old woman with left frontotemporal lobe atrophy who presented with progressive anarthria and non-fluent aphasia. Her brother had been diagnosed with corticobasal syndrome (CBS) with right-hand limb-kinetic apraxia, aphasia, and a similar pattern of brain atrophy. Laboratory blood examinations did not reveal abnormalities that could have caused cognitive dysfunction. In the cerebrospinal fluid, cell counts and protein concentrations were within normal ranges, and concentrations of tau protein and phosphorylated tau protein were also normal. Since similar familial cases due to mutation of GRN and microtubule-associated protein tau gene (MAPT) were reported, we performed genetic analysis. No pathological mutations of MAPT were identified, but we identified a novel GRN frameshift mutation (c.1118_1119delCCinsG: p.Pro373ArgX37) that resulted in progranulin haploinsufficiency. This is the first report of a GRN mutation associated with familial phenotypic heterogeneity in Japan. Literature review of GRN mutations associated with familial phenotypic heterogeneity revealed no tendency of mutation sites. The role of progranulin has been reported in this and other neurodegenerative diseases, and the analysis of GRN mutations may lead to the discovery of a new therapeutic target.
A novel gene network inference algorithm using predictive minimum description length approach.

Science.gov (United States)

Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

2010-05-28

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is determining the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of the model length and the data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
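To make the role of the MI threshold concrete, the following minimal sketch computes empirical mutual information between discretized expression profiles, keeps gene pairs above a threshold, and scores the result with precision and recall. It is an illustration only: the threshold here is fixed by hand (the PMDL step that selects it automatically is not reproduced), CMI is omitted, and all names and data are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def infer_edges(profiles, threshold):
    """Return undirected gene pairs whose MI exceeds the threshold."""
    genes = sorted(profiles)
    return {
        (a, b)
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
        if mutual_information(profiles[a], profiles[b]) > threshold
    }

def precision_recall(predicted, truth):
    """Precision and recall of a predicted edge set against a reference set."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical binarized expression profiles (8 time points each).
profiles = {
    "gA": [0, 0, 1, 1, 0, 1, 0, 1],
    "gB": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to gA, so MI is 1 bit
    "gC": [0, 1, 0, 1, 0, 0, 1, 1],  # independent of gA, so MI is 0 bits
}
edges = infer_edges(profiles, threshold=0.5)  # hand-picked threshold
truth = {("gA", "gB"), ("gB", "gC")}          # hypothetical reference network
precision, recall = precision_recall(edges, truth)
```

Raising the threshold removes weak edges and tends to improve precision at the cost of recall, which is why choosing it without a user-tuned parameter matters.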
Reconciling taxonomy and phylogenetic inference: formalism and algorithms for describing discord and inferring taxonomic roots

Directory of Open Access Journals (Sweden)

Matsen Frederick A

2012-05-01

Full Text Available BACKGROUND: Although taxonomy is often used informally to evaluate the results of phylogenetic inference and the root of phylogenetic trees, algorithmic methods to do so are lacking. RESULTS: In this paper we formalize these procedures and develop algorithms to solve the relevant problems. In particular, we introduce a new algorithm that solves a "subcoloring" problem to express the difference between a taxonomy and a phylogeny at a given rank. This algorithm improves upon the current best algorithm in terms of asymptotic complexity for the parameter regime of interest; we also describe a branch-and-bound algorithm that saves orders of magnitude in computation on real data sets. We also develop a formalism and an algorithm for rooting phylogenetic trees according to a taxonomy. CONCLUSIONS: The algorithms in this paper, and the associated freely-available software, will help biologists better use and understand taxonomically labeled phylogenetic trees.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

Science.gov (United States)

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve on the tedious task of reconstructing gene networks by experimentally testing the possible interactions between genes, it has become a trend to adopt an automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks with an evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, cloud computing is advocated as a promising solution: the most popular approach is the MapReduce programming model, a fault-tolerant framework for implementing parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high
Trehalose upregulates progranulin expression in human and mouse models of GRN haploinsufficiency: a novel therapeutic lead to treat frontotemporal dementia.

Science.gov (United States)

Holler, Christopher J; Taylor, Georgia; McEachin, Zachary T; Deng, Qiudong; Watkins, William J; Hudson, Kathryn; Easley, Charles A; Hu, William T; Hales, Chadwick M; Rossoll, Wilfried; Bassell, Gary J; Kukar, Thomas

2016-06-24

Progranulin (PGRN) is a secreted growth factor important for neuronal survival and may do so, in part, by regulating lysosome homeostasis. Mutations in the PGRN gene (GRN) are a common cause of frontotemporal lobar degeneration (FTLD) and lead to disease through PGRN haploinsufficiency. Additionally, complete loss of PGRN in humans leads to neuronal ceroid lipofuscinosis (NCL), a lysosomal storage disease. Importantly, Grn-/- mouse models recapitulate pathogenic lysosomal features of NCL. Further, GRN variants that decrease PGRN expression increase the risk of developing Alzheimer's disease (AD) and Parkinson's disease (PD). Together these findings demonstrate that insufficient PGRN predisposes neurons to degeneration. Therefore, compounds that increase PGRN levels are potential therapeutics for multiple neurodegenerative diseases. Here, we performed a cell-based screen of a library of known autophagy-lysosome modulators and identified multiple novel activators of a human GRN promoter reporter including several common mTOR inhibitors and an mTOR-independent activator of autophagy, trehalose. Secondary cellular screens identified trehalose, a natural disaccharide, as the most promising lead compound because it increased endogenous PGRN in all cell lines tested and has multiple reported neuroprotective properties. Trehalose dose-dependently increased GRN mRNA as well as intracellular and secreted PGRN in both mouse and human cell lines and this effect was independent of the transcription factor EB (TFEB). Moreover, trehalose rescued PGRN deficiency in human fibroblasts and neurons derived from induced pluripotent stem cells (iPSCs) generated from GRN mutation carriers. Finally, oral administration of trehalose to Grn haploinsufficient mice significantly increased PGRN expression in the brain. This work reports several novel autophagy-lysosome modulators that enhance PGRN expression and identifies trehalose as a promising therapeutic for raising PGRN levels to treat
In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics.

Science.gov (United States)

Audain, Enrique; Uszkoreit, Julian; Sachsenberg, Timo; Pfeuffer, Julianus; Liang, Xiao; Hermjakob, Henning; Sanchez, Aniel; Eisenacher, Martin; Reinert, Knut; Tabb, David L; Kohlbacher, Oliver; Perez-Riverol, Yasset

2017-01-06

In mass spectrometry-based shotgun proteomics, protein identifications are usually the desired result. However, most of the analytical methods are based on the identification of reliable peptides and not the direct identification of intact proteins. Thus, assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is a critical step in proteomics research. Currently, different protein inference algorithms and tools are available for the proteomics community. Here, we evaluated five software tools for protein inference (PIA, ProteinProphet, Fido, ProteinLP, MSBayesPro) using three popular database search engines: Mascot, X!Tandem, and MS-GF+. All the algorithms were evaluated using a highly customizable KNIME workflow using four different public datasets with varying complexities (different sample preparation, species and analytical instruments). We defined a set of quality control metrics to evaluate the performance of each combination of search engines, protein inference algorithm, and parameters on each dataset. We show that the results for complex samples vary not only regarding the actual numbers of reported protein groups but also concerning the actual composition of groups. Furthermore, the robustness of reported proteins when using databases of differing complexities is strongly dependent on the applied inference algorithm. Finally, merging the identifications of multiple search engines does not necessarily increase the number of reported proteins, but does increase the number of peptides per protein and thus can generally be recommended. Protein inference is one of the major challenges in MS-based proteomics nowadays. Currently, there are a vast number of protein inference algorithms and implementations available for the proteomics community. Protein assembly impacts the final results of the research, the quantitation values and the final claims in the research manuscript. Even though protein
Comparison of evolutionary algorithms in gene regulatory network model inference.

LENUS (Irish Health Repository)

2010-01-01

ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
Inferring microbial interaction networks from metagenomic data using SgLV-EKF algorithm.

Science.gov (United States)

Alshawaqfeh, Mustafa; Serpedin, Erchin; Younes, Ahmad Bani

2017-03-27

Inferring the microbial interaction networks (MINs) and modeling their dynamics are critical in understanding the mechanisms of the bacterial ecosystem and designing antibiotic and/or probiotic therapies. Recently, several approaches were proposed to infer MINs using the generalized Lotka-Volterra (gLV) model. Main drawbacks of these models include the fact that these models only consider the measurement noise without taking into consideration the uncertainties in the underlying dynamics. Furthermore, inferring the MIN is characterized by the limited number of observations and nonlinearity in the regulatory mechanisms. Therefore, novel estimation techniques are needed to address these challenges. This work proposes SgLV-EKF: a stochastic gLV model that adopts the extended Kalman filter (EKF) algorithm to model the MIN dynamics. In particular, SgLV-EKF employs a stochastic modeling of the MIN by adding a noise term to the dynamical model to compensate for modeling uncertainties. This stochastic modeling is more realistic than the conventional gLV model which assumes that the MIN dynamics are perfectly governed by the gLV equations. After specifying the stochastic model structure, we propose the EKF to estimate the MIN. SgLV-EKF was compared with two similarity-based algorithms, one algorithm from the integral-based family and two regression-based algorithms, in terms of the achieved performance on two synthetic data-sets and two real data-sets. The first data-set models the randomness in measurement data, whereas, the second data-set incorporates uncertainties in the underlying dynamics. The real data-sets are provided by a recent study pertaining to an antibiotic-mediated Clostridium difficile infection. The experimental results demonstrate that SgLV-EKF outperforms the alternative methods in terms of robustness to measurement noise, modeling errors, and tracking the dynamics of the MIN. Performance analysis demonstrates that the proposed SgLV-EKF algorithm
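As a rough illustration of what "adding a noise term to the dynamical model" can mean for the gLV equations described above, the sketch below advances a gLV system by one Euler-Maruyama step with additive Gaussian noise. The discretization, the additive form of the noise, the clamping of abundances at zero, and all parameter names are assumptions of this sketch and not details taken from the SgLV-EKF paper; the EKF estimation step is not shown.

```python
import math
import random

def sglv_step(x, growth, interactions, dt, sigma=0.0, rng=random):
    """One Euler-Maruyama step of a gLV model with an additive noise term.

    Deterministic part: dx_i/dt = x_i * (growth_i + sum_j interactions[i][j] * x_j)
    """
    nxt = []
    for i, xi in enumerate(x):
        drift = xi * (growth[i] + sum(a * xj for a, xj in zip(interactions[i], x)))
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Abundances are clamped at zero (an assumption of this sketch).
        nxt.append(max(xi + dt * drift + noise, 0.0))
    return nxt

# Hypothetical two-species system with self-limitation only.
growth = [1.0, 0.5]
interactions = [[-1.0, 0.0], [0.0, -0.5]]

x = [0.1, 0.1]
for _ in range(5000):
    x = sglv_step(x, growth, interactions, dt=0.01)  # sigma=0: conventional gLV
```

With `sigma=0.0` the step reduces to the conventional deterministic gLV model; a positive `sigma` (for example `sigma=0.05` with `rng=random.Random(0)` for reproducibility) yields trajectories that fluctuate around the deterministic path.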
Identification of Fuzzy Inference Systems by Means of a Multiobjective Opposition-Based Space Search Algorithm

Directory of Open Access Journals (Sweden)

Wei Huang

2013-01-01

Full Text Available We introduce a new category of fuzzy inference systems with the aid of a multiobjective opposition-based space search algorithm (MOSSA). The proposed MOSSA is essentially a multiobjective space search algorithm improved by using opposition-based learning, which employs a so-called opposite numbers mechanism to speed up the convergence of the optimization algorithm. In the identification of a fuzzy inference system, the MOSSA is exploited to carry out the parametric identification of the fuzzy model as well as to realize its structural identification. Experimental results demonstrate the effectiveness of the proposed fuzzy models.
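The opposite numbers mechanism referred to above is commonly defined, for a coordinate x in an interval [a, b], as the point a + b - x. The minimal sketch below evaluates a candidate together with its opposite and keeps the better one. It uses a single hypothetical cost function for brevity, whereas MOSSA itself is multiobjective; function names, bounds, and the cost function are illustrative assumptions.

```python
def opposite(point, lower, upper):
    """Opposite point: each coordinate x in [a, b] maps to a + b - x."""
    return [a + b - x for x, a, b in zip(point, lower, upper)]

def better_of_pair(point, lower, upper, cost):
    """Evaluate a candidate and its opposite; keep whichever costs less."""
    return min((point, opposite(point, lower, upper)), key=cost)

def distance_to_target(p):
    """Hypothetical cost: squared distance to the point (3, 3)."""
    return sum((v - 3.0) ** 2 for v in p)

lower, upper = [0.0, 0.0], [4.0, 4.0]
candidate = [0.5, 1.0]
mirrored = opposite(candidate, lower, upper)
chosen = better_of_pair(candidate, lower, upper, distance_to_target)
```

Checking the opposite point costs one extra evaluation per candidate, and the idea is that when a guess is far from the optimum, its mirror image within the bounds is often closer.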
Efficient fuzzy Bayesian inference algorithms for incorporating expert knowledge in parameter estimation

Science.gov (United States)

Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad

2016-05-01

Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference', which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which make it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows uncertainty and imprecision to be modeled separately, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However, an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely large and often computationally infeasible. In this paper, a novel approach to accelerating the fuzzy Bayesian inference algorithm is proposed, which is based on using approximate posterior distributions derived from surrogate modeling as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
An Improved Binary Differential Evolution Algorithm to Infer Tumor Phylogenetic Trees.

Science.gov (United States)

Liang, Ying; Liao, Bo; Zhu, Wen

2017-01-01

Tumourigenesis is a mutation accumulation process, which is likely to start with a mutated founder cell. The evolutionary nature of tumor development makes phylogenetic models suitable for inferring tumor evolution through genetic variation data. Copy number variation (CNV) is the major genetic marker of the genome with more genes, disease loci, and functional elements involved. Fluorescence in situ hybridization (FISH) accurately measures multiple gene copy numbers of hundreds of single cells. We propose an improved binary differential evolution algorithm, BDEP, to infer tumor phylogenetic trees based on the FISH platform. The topology analysis of the tumor progression tree shows that the pathway of tumor subcell expansion varies greatly during different stages of tumor formation. The classification experiment shows that tree-based features are better than data-based features in distinguishing tumor. The constructed phylogenetic trees perform well in characterizing the tumor development process and outperform those of other similar algorithms.
White matter hyperintensities are seen only in GRN mutation carriers in the GENFI cohort

NARCIS (Netherlands)

Sudre, C.H. (Carole H.); M. Bocchetta (Martina); D.M. Cash (David M); D.L. Thomas (David L); Woollacott, I. (Ione); Dick, K.M. (Katrina M.); J.C. van Swieten (John); B. Borroni (Barbara); D. Galimberti (Daniela); M. Masellis (Mario); M.C. Tartaglia (Maria Carmela); J.B. Rowe (James); M.J. Graff (Maud J.L.); F. Tagliavini (Fabrizio); G.B. Frisoni (Giovanni B.); R. Laforce (Robert); E. Finger (Elizabeth); A. De Mendonça (Alexandre); S. Sorbi (Sandro); S. Ourselin (Sebastien); M.J. Cardoso (Manuel Jorge); J.D. Rohrer (Jonathan D)

2017-01-01

Genetic frontotemporal dementia is most commonly caused by mutations in the progranulin (GRN), microtubule-associated protein tau (MAPT) and chromosome 9 open reading frame 72 (C9orf72) genes. Previous small studies have reported the presence of cerebral white matter hyperintensities
Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

Science.gov (United States)

Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

2017-10-06

Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction, and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profile learning, namely the construction of the search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and its outstanding performance in GRN inference.
CSF protein changes associated with hippocampal sclerosis risk gene variants highlight impact of GRN/PGRN.

Science.gov (United States)

Fardo, David W; Katsumata, Yuriko; Kauwe, John S K; Deming, Yuetiva; Harari, Oscar; Cruchaga, Carlos; Nelson, Peter T

2017-04-01

Hippocampal sclerosis of aging (HS-Aging) is a common cause of dementia in older adults. We tested the variability in cerebrospinal fluid (CSF) proteins associated with previously identified HS-Aging risk single nucleotide polymorphisms (SNPs). Alzheimer's Disease Neuroimaging Initiative cohort (ADNI; n=237) data, combining both multiplexed proteomics CSF and genotype data, were used to assess the association between CSF analytes and risk SNPs in four genes (SNPs): GRN (rs5848), TMEM106B (rs1990622), ABCC9 (rs704180), and KCNMB2 (rs9637454). For controls, non-HS-Aging SNPs in APOE (rs429358/rs7412) and MAPT (rs8070723) were also analyzed against Aβ1-42 and total tau CSF analytes. The GRN risk SNP (rs5848) status correlated with variation in CSF proteins, with the risk allele (T) associated with increased levels of AXL Receptor Tyrosine Kinase (AXL), TNF-Related Apoptosis-Inducing Ligand Receptor 3 (TRAIL-R3), Vascular Cell Adhesion Molecule-1 (VCAM-1) and clusterin (CLU) (all p<0.05 after Bonferroni correction). The TRAIL-R3 correlation was significant in meta-analysis with an additional dataset (p=5.05×10⁻⁵). Further, the rs5848 SNP status was associated with increased CSF tau protein, a marker of neurodegeneration (p=0.015). These data are remarkable since this GRN SNP has been found to be a risk factor for multiple types of dementia-related brain pathologies. Copyright © 2017 Elsevier Inc. All rights reserved.
An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

KAUST Repository

Bayer, Christian

2016-01-06

In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem of approximating the reaction coefficients based on discretely observed data. To this end, we introduce an efficient two-phase algorithm in which the first phase is deterministic and it is intended to provide a starting point for the second phase which is the Monte Carlo EM Algorithm.
An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks

KAUST Repository

Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

2016-01-01

In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem

A neuro-fuzzy inference system tuned by particle swarm optimization algorithm for sensor monitoring

Energy Technology Data Exchange (ETDEWEB)

Oliveira, Mauro Vitor de [Instituto de Engenharia Nuclear (IEN), Rio de Janeiro, RJ (Brazil). Div. de Instrumentacao e Confiabilidade Humana]. E-mail: mvitor@ien.gov.br; Schirru, Roberto [Universidade Federal, Rio de Janeiro, RJ (Brazil). Coordenacao dos Programas de Pos-graduacao de Engenharia. Lab. de Monitoracao de Processos

2005-07-01

A neuro-fuzzy inference system (ANFIS) tuned by a particle swarm optimization (PSO) algorithm has been developed to monitor the relevant sensor in a nuclear plant using the information from other sensors. The antecedent parameters of the ANFIS that estimates the relevant sensor signal are optimized by a PSO algorithm, and the consequent parameters use a least-squares algorithm. The proposed sensor-monitoring algorithm was demonstrated through the estimation of the nuclear power value in a pressurized water reactor, using six other correlated signals as input to the ANFIS. The obtained results are compared to two similar ANFIS models, one using gradient descent (GD) and the other a genetic algorithm (GA) as the training algorithm for the antecedent parameters. (author)
A neuro-fuzzy inference system tuned by particle swarm optimization algorithm for sensor monitoring

International Nuclear Information System (INIS)

Oliveira, Mauro Vitor de; Schirru, Roberto

2005-01-01

A neuro-fuzzy inference system (ANFIS) tuned by a particle swarm optimization (PSO) algorithm has been developed to monitor the relevant sensor in a nuclear plant using the information of other sensors. The antecedent parameters of the ANFIS that estimates the relevant sensor signal are optimized by a PSO algorithm, and the consequent parameters use a least-squares algorithm. The proposed sensor-monitoring algorithm was demonstrated through the estimation of the nuclear power value in a pressurized water reactor, using six other correlated signals as input to the ANFIS. The obtained results are compared to two similar ANFIS that use gradient descent (GD) and a genetic algorithm (GA), respectively, as the training algorithm for the antecedent parameters. (author)
A comparison of algorithms for inference and learning in probabilistic graphical models.

Science.gov (United States)

Frey, Brendan J; Jojic, Nebojsa

2005-09-01

Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
Reveal, A General Reverse Engineering Algorithm for Inference of Genetic Network Architectures

Science.gov (United States)

Liang, Shoudan; Fuhrman, Stefanie; Somogyi, Roland

1998-01-01

Given the imminent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible 10^15) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.
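The mutual-information test described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' C program: the function names, the three-gene toy network and the `max_k` limit are assumptions made here for illustration. The idea shown is that an input set fully determines a gene when its mutual information with the gene's next state equals the entropy of that next state.

```python
from collections import Counter
from itertools import combinations, product
from math import log2

def entropy(samples):
    """Shannon entropy (in bits) of a list of hashable observations."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """M(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def infer_inputs(inputs, outputs, target, max_k=3, tol=1e-9):
    """Return the smallest input set that fully determines gene `target`.

    inputs/outputs are lists of state tuples (state at t, state at t+1).
    Candidate sets are tried in order of increasing size k.
    """
    y = [row[target] for row in outputs]
    h_y = entropy(y)
    if h_y < tol:          # constant output: no inputs needed
        return ()
    n_genes = len(inputs[0])
    for k in range(1, max_k + 1):
        for subset in combinations(range(n_genes), k):
            x = [tuple(row[i] for i in subset) for row in inputs]
            if abs(mutual_information(x, y) - h_y) < tol:
                return subset
    return None            # nothing found within max_k inputs

# Hypothetical toy network: A' = B, B' = A AND C, C' = NOT A.
inputs = list(product([0, 1], repeat=3))
outputs = [(b, a & c, 1 - a) for a, b, c in inputs]

for gene, name in enumerate("ABC"):
    print(name, "is controlled by genes", infer_inputs(inputs, outputs, gene))
```

With a complete state transition table, as here, the wiring is recovered exactly; with incomplete tables the same test can accept spurious input sets, which is why the abstract reports results as a function of the number of transition pairs.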
Profiling of Ubiquitination Pathway Genes in Peripheral Cells from Patients with Frontotemporal Dementia due to C9ORF72 and GRN Mutations

Directory of Open Access Journals (Sweden)

Maria Serpente

2015-01-01

We analysed the expression levels of 84 key genes involved in the regulated degradation of cellular protein by the ubiquitin-proteasome system in peripheral cells from patients with frontotemporal dementia (FTD) due to C9ORF72 and GRN mutations, as compared with sporadic FTD and age-matched controls. A SABiosciences PCR array was used to investigate the transcription profile in a discovery population consisting of six patients each in C9ORF72, GRN, sporadic FTD and age-matched control groups. A generalized down-regulation of gene expression compared with controls was observed in C9ORF72 expansion carriers and sporadic FTD patients. In particular, in both groups, four genes, UBE2I, UBE2Q1, UBE2E1 and UBE2N, were down-regulated at a statistically significant (p < 0.05) level. All of them encode members of the E2 ubiquitin-conjugating enzyme family. In GRN mutation carriers, no statistically significant deregulation of ubiquitination pathway genes was observed, except for the UBE2Z gene, which displays E2 ubiquitin-conjugating enzyme activity and was found to be statistically significantly up-regulated (p = 0.006). These preliminary results suggest that the proteasomal degradation pathway plays a role in the pathogenesis of FTD associated with TDP-43 pathology, although different proteins are altered in carriers of GRN mutations as compared with carriers of the C9ORF72 expansion.
Defining the association of TMEM106B variants among frontotemporal lobar degeneration patients with GRN mutations and C9orf72 repeat expansions.

Science.gov (United States)

Lattante, Serena; Le Ber, Isabelle; Galimberti, Daniela; Serpente, Maria; Rivaud-Péchoux, Sophie; Camuzat, Agnès; Clot, Fabienne; Fenoglio, Chiara; Scarpini, Elio; Brice, Alexis; Kabashi, Edor

2014-11-01

TMEM106B was identified as a risk factor for frontotemporal lobar degeneration (FTD) with TAR DNA-binding protein 43 kDa inclusions. It has been reported that variants in this gene are genetic modifiers of the disease and that this association is stronger in patients carrying a GRN mutation or a pathogenic expansion in chromosome 9 open reading frame 72 (C9orf72) gene. Here, we investigated the contribution of TMEM106B polymorphisms in cohorts of FTD and FTD with amyotrophic lateral sclerosis patients from France and Italy. Patients carrying the C9orf72 expansion (n = 145) and patients with GRN mutations (n = 76) were compared with a group of FTD patients (n = 384) negative for mutations and to a group of healthy controls (n = 552). In our cohorts, the presence of the C9orf72 expansion did not correlate with TMEM106B genotypes but the association was very strong in individuals with pathogenic GRN mutations (p = 9.54 × 10^-6). Our data suggest that TMEM106B genotypes differ in FTD patient cohorts and strengthen the protective role of TMEM106B in GRN carriers. Further studies are needed to determine whether TMEM106B polymorphisms are associated with other genetic causes for FTD, including C9orf72 repeat expansions. Copyright © 2014 Elsevier Inc. All rights reserved.
Reconstructing gene regulatory networks from knock-out data using Gaussian Noise Model and Pearson Correlation Coefficient.

Science.gov (United States)

Mohamed Salleh, Faridah Hani; Arif, Shereena Mohd; Zainudin, Suhaila; Firdaus-Raih, Mohd

2015-12-01

A gene regulatory network (GRN) is a large and complex network consisting of interacting elements that, over time, affect each other's state. The dynamics of complex gene regulatory processes are difficult to understand using intuitive approaches alone. To overcome this problem, we propose an algorithm for inferring the regulatory interactions from knock-out data using a Gaussian model combined with the Pearson Correlation Coefficient (PCC). There are several problems relating to GRN construction that have been outlined in this paper. We demonstrated the ability of our proposed method to predict (1) the presence of regulatory interactions between genes, (2) their directionality and (3) their states (activation or suppression). The algorithm was applied to network sizes of 10 and 50 genes from DREAM3 datasets and network sizes of 10 from DREAM4 datasets. The predicted networks were evaluated based on AUROC and AUPR. We discovered that high false positive values were generated by our GRN prediction methods because the indirect regulations have been wrongly predicted as true relationships. We achieved satisfactory results as the majority of sub-networks achieved AUROC values above 0.5. Copyright © 2015 Elsevier Ltd. All rights reserved.
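Two of the quantities named in this abstract, the Pearson correlation coefficient and AUROC, can be sketched in a few lines of Python. This is only an illustration of the two measures, not the authors' Gaussian-model method; the function names and the toy numbers are assumptions. The sign of the correlation is the usual cue for activation versus suppression, and an AUROC of 0.5 corresponds to random edge ranking.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def auroc(scores, labels):
    """Probability that a true edge outranks a non-edge (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical expression profiles of three genes across four experiments.
regulator = [1.0, 0.2, 0.9, 0.1]
activated = [2.1, 0.5, 1.8, 0.4]   # rises and falls with the regulator
repressed = [0.3, 1.9, 0.4, 2.2]   # moves opposite to the regulator

print(pearson(regulator, activated) > 0)   # positive sign: activation
print(pearson(regulator, repressed) < 0)   # negative sign: suppression

# Hypothetical edge scores against a known gold standard (1 = true edge).
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))   # 0.75
```

Correlation alone cannot separate direct from indirect regulation, which is consistent with the false positives the abstract reports.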
Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles.

Science.gov (United States)

Yu, Yun; Warnow, Tandy; Nakhleh, Luay

2011-11-01

One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is Minimize Deep Coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily rooted. In this article, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. In addition, we devise MDC-based algorithms for cases when multiple alleles per species may be sampled. We study the performance of these methods in coalescent-based computer simulations.
Estimation of tool wear length in finish milling using a fuzzy inference algorithm

Science.gov (United States)

Ko, Tae Jo; Cho, Dong Woo

1993-10-01

The geometric accuracy and surface roughness are mainly affected by the flank wear at the minor cutting edge in finish machining. A fuzzy estimator obtained by a fuzzy inference algorithm with a max-min composition rule to evaluate the minor flank wear length in finish milling is introduced. The features sensitive to minor flank wear are extracted from the dispersion analysis of a time series AR model of the feed directional acceleration of the spindle housing. Linguistic rules for fuzzy estimation are constructed using these features, and then fuzzy inferences are carried out with test data sets under various cutting conditions. The proposed system turns out to be effective for estimating minor flank wear length, and its mean error is less than 12%.
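The max-min composition rule mentioned in this abstract can be shown with a small Python sketch. The membership values, the relation matrix and the class labels below are hypothetical and are not taken from the paper; only the composition rule itself, B(y) = max over x of min(A(x), R(x, y)), is standard.

```python
def max_min_compose(a, r):
    """Max-min composition of a fuzzy set `a` with a fuzzy relation `r`.

    a: membership degrees of the observed feature levels (length m)
    r: m x n relation matrix linking feature levels to output classes
    Returns the n membership degrees of the inferred output classes.
    """
    n_out = len(r[0])
    return [max(min(a_i, row[j]) for a_i, row in zip(a, r))
            for j in range(n_out)]

# Hypothetical feature memberships: (low, medium, high) vibration feature.
feature = [0.2, 0.7, 1.0]

# Hypothetical relation to wear classes (small wear, large wear).
relation = [
    [0.9, 0.1],   # low feature level
    [0.5, 0.6],   # medium feature level
    [0.2, 1.0],   # high feature level
]

print(max_min_compose(feature, relation))   # [0.5, 1.0]
```

A crisp wear estimate would then require a defuzzification step, which is outside this sketch.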
Use of genetic algorithms in operations management. Part II - Results.

OpenAIRE

Stockton, David; Quinn, L. (Liam); Khalil, R. A. (Riham A.)

2004-01-01

The insight gained into the relationship between genetic algorithm (GA) structure and optimisation performance, through the research reported in this paper, provided the knowledge to integrate GAs with discrete event simulation which formed the output from IMI EPSRC Project GR/N05871 ‘Responsive Design and Operation of Flexible Machining Lines’ rated by EPSRC as “Tending to Internationally Leading” where industrial partners included , Unipart Group Ltd and Nigel.Shir...
Grammatical inference algorithms, routines and applications

CERN Document Server

Wieczorek, Wojciech

2017-01-01

This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.
Identifying elementary iterated systems through algorithmic inference: The Cantor set example

Energy Technology Data Exchange (ETDEWEB)

Apolloni, Bruno [Dipartimento di Scienze dell' Informazione, Universita degli Studi di Milano, Via Comelico 39/41, 20135 Milan (Italy)]. E-mail: apolloni@dsi.unimi.it; Bassis, Simone [Dipartimento di Scienze dell' Informazione, Universita degli Studi di Milano, Via Comelico 39/41, 20135 Milan (Italy)]. E-mail: bassis@dsi.unimi.it

2006-10-15

We come back to the old problem of fractal identification within the new framework of algorithmic inference. The key points are: (i) to identify sufficient statistics to be put in connection with the unknown values of the fractal parameters, and (ii) to manage the timing of the iterated process through spatial statistics. We fill these tasks successfully with the Cantor sets. We are able to compute confidence intervals for both the scaling parameter θ and the iteration number n at which we are observing a set. We both check numerically the coverage of these intervals and delineate a general strategy for affording more complex iterated systems.
A Combined Methodology of Adaptive Neuro-Fuzzy Inference System and Genetic Algorithm for Short-term Energy Forecasting

Directory of Open Access Journals (Sweden)

KAMPOUROPOULOS, K.

2014-02-01

This document presents an energy forecast methodology using an Adaptive Neuro-Fuzzy Inference System (ANFIS) and Genetic Algorithms (GA). The GA has been used for the selection of the training inputs of the ANFIS in order to minimize the training result error. The presented algorithm has been installed and is operating in an automotive manufacturing plant. It periodically communicates with the plant to obtain new information and update the database in order to improve its training results. Finally, the obtained results of the algorithm are used to provide a short-term load forecast for the different modeled consumption processes.
Statistical Physics, Optimization, Inference, and Message-Passing Algorithms : Lecture Notes of the Les Houches School of Physics : Special Issue, October 2013

CERN Document Server

Ricci-Tersenghi, Federico; Zdeborova, Lenka; Zecchina, Riccardo; Tramel, Eric W; Cugliandolo, Leticia F

2015-01-01

This book contains a collection of the presentations that were given in October 2013 at the Les Houches Autumn School on statistical physics, optimization, inference, and message-passing algorithms. In the last decade, there has been increasing convergence of interest and methods between theoretical physics and fields as diverse as probability, machine learning, optimization, and inference problems. In particular, much theoretical and applied work in statistical physics and computer science has relied on the use of message-passing algorithms and their connection to the statistical physics of glasses and spin glasses. For example, both the replica and cavity methods have led to recent advances in compressed sensing, sparse estimation, and random constraint satisfaction, to name a few. This book’s detailed pedagogical lectures on statistical inference, computational complexity, the replica and cavity methods, and belief propagation are aimed particularly at PhD students, post-docs, and young researchers desir...
Single-cell and coupled GRN models of cell patterning in the Arabidopsis thaliana root stem cell niche

Directory of Open Access Journals (Sweden)

Alvarez-Buylla Elena R

2010-10-01

Background: Recent experimental work has uncovered some of the genetic components required to maintain the Arabidopsis thaliana root stem cell niche (SCN) and its structure. Two main pathways are involved. One pathway depends on the genes SHORTROOT and SCARECROW and the other depends on the PLETHORA genes, which have been proposed to constitute the auxin readouts. Recent evidence suggests that a regulatory circuit, composed of WOX5 and CLE40, also contributes to the SCN maintenance. Yet, we still do not understand how the niche is dynamically maintained and patterned or if the uncovered molecular components are sufficient to recover the observed gene expression configurations that characterize the cell types within the root SCN. Mathematical and computational tools have proven useful in understanding the dynamics of cell differentiation. Hence, to further explore root SCN patterning, we integrated available experimental data into dynamic Gene Regulatory Network (GRN) models and addressed if these are sufficient to attain observed gene expression configurations in the root SCN in a robust and autonomous manner. Results: We found that an SCN GRN model based only on experimental data did not reproduce the configurations observed within the root SCN. We developed several alternative GRN models that recover these expected stable gene configurations. Such models incorporate a few additional components and interactions in addition to those that have been uncovered. The recovered configurations are stable to perturbations, and the models are able to recover the observed gene expression profiles of almost all the mutants described so far. However, the robustness of the postulated GRNs is not as high as that of other previously studied networks. Conclusions: These models are the first published approximations for a dynamic mechanism of the A. thaliana root SCN cellular patterning. Our model is useful to formally show that the data now available are not
Potential genetic modifiers of disease risk and age at onset in patients with frontotemporal lobar degeneration and GRN mutations: a genome-wide association study.

Science.gov (United States)

Pottier, Cyril; Zhou, Xiaolai; Perkerson, Ralph B; Baker, Matt; Jenkins, Gregory D; Serie, Daniel J; Ghidoni, Roberta; Benussi, Luisa; Binetti, Giuliano; López de Munain, Adolfo; Zulaica, Miren; Moreno, Fermin; Le Ber, Isabelle; Pasquier, Florence; Hannequin, Didier; Sánchez-Valle, Raquel; Antonell, Anna; Lladó, Albert; Parsons, Tammee M; Finch, NiCole A; Finger, Elizabeth C; Lippa, Carol F; Huey, Edward D; Neumann, Manuela; Heutink, Peter; Synofzik, Matthis; Wilke, Carlo; Rissman, Robert A; Slawek, Jaroslaw; Sitek, Emilia; Johannsen, Peter; Nielsen, Jørgen E; Ren, Yingxue; van Blitterswijk, Marka; DeJesus-Hernandez, Mariely; Christopher, Elizabeth; Murray, Melissa E; Bieniek, Kevin F; Evers, Bret M; Ferrari, Camilla; Rollinson, Sara; Richardson, Anna; Scarpini, Elio; Fumagalli, Giorgio G; Padovani, Alessandro; Hardy, John; Momeni, Parastoo; Ferrari, Raffaele; Frangipane, Francesca; Maletta, Raffaele; Anfossi, Maria; Gallo, Maura; Petrucelli, Leonard; Suh, EunRan; Lopez, Oscar L; Wong, Tsz H; van Rooij, Jeroen G J; Seelaar, Harro; Mead, Simon; Caselli, Richard J; Reiman, Eric M; Noel Sabbagh, Marwan; Kjolby, Mads; Nykjaer, Anders; Karydas, Anna M; Boxer, Adam L; Grinberg, Lea T; Grafman, Jordan; Spina, Salvatore; Oblak, Adrian; Mesulam, M-Marsel; Weintraub, Sandra; Geula, Changiz; Hodges, John R; Piguet, Olivier; Brooks, William S; Irwin, David J; Trojanowski, John Q; Lee, Edward B; Josephs, Keith A; Parisi, Joseph E; Ertekin-Taner, Nilüfer; Knopman, David S; Nacmias, Benedetta; Piaceri, Irene; Bagnoli, Silvia; Sorbi, Sandro; Gearing, Marla; Glass, Jonathan; Beach, Thomas G; Black, Sandra E; Masellis, Mario; Rogaeva, Ekaterina; Vonsattel, Jean-Paul; Honig, Lawrence S; Kofler, Julia; Bruni, Amalia C; Snowden, Julie; Mann, David; Pickering-Brown, Stuart; Diehl-Schmid, Janine; Winkelmann, Juliane; Galimberti, Daniela; Graff, Caroline; Öijerstedt, Linn; Troakes, Claire; Al-Sarraj, Safa; Cruchaga, Carlos; Cairns, Nigel J; Rohrer, Jonathan D; Halliday, Glenda M; Kwok, John B; van Swieten, John C; White, Charles L; Ghetti, Bernardino; Murell, Jill R; Mackenzie, Ian R A; Hsiung, Ging-Yuek R; Borroni, Barbara; Rossi, Giacomina; Tagliavini, Fabrizio; Wszolek, Zbigniew K; Petersen, Ronald C; Bigio, Eileen H; Grossman, Murray; Van Deerlin, Vivianna M; Seeley, William W; Miller, Bruce L; Graff-Radford, Neill R; Boeve, Bradley F; Dickson, Dennis W; Biernacka, Joanna M; Rademakers, Rosa

2018-06-01

Loss-of-function mutations in GRN cause frontotemporal lobar degeneration (FTLD). Patients with GRN mutations present with a uniform subtype of TAR DNA-binding protein 43 (TDP-43) pathology at autopsy (FTLD-TDP type A); however, age at onset and clinical presentation are variable, even within families. We aimed to identify potential genetic modifiers of disease onset and disease risk in GRN mutation carriers. The study was done in three stages: a discovery stage, a replication stage, and a meta-analysis of the discovery and replication data. In the discovery stage, genome-wide logistic and linear regression analyses were done to test the association of genetic variants with disease risk (case or control status) and age at onset in patients with a GRN mutation and controls free of neurodegenerative disorders. Suggestive loci (p < 1 × 10^-5) were genotyped in a replication cohort of patients and controls, followed by a meta-analysis. The effect of genome-wide significant variants at the GFRA2 locus on expression of GFRA2 was assessed using mRNA expression studies in cerebellar tissue samples from the Mayo Clinic brain bank. The effect of the GFRA2 locus on progranulin concentrations was studied using previously generated ELISA-based expression data. Co-immunoprecipitation experiments in HEK293T cells were done to test for a direct interaction between GFRA2 and progranulin. Individuals were enrolled in the current study between Sept 16, 2014, and Oct 5, 2017. After quality control measures, statistical analyses in the discovery stage included 382 unrelated symptomatic GRN mutation carriers and 1146 controls free of neurodegenerative disorders collected from 34 research centres located in the USA, Canada, Australia, and Europe. In the replication stage, 210 patients (67 symptomatic GRN mutation carriers and 143 patients with FTLD without GRN mutations pathologically confirmed as FTLD-TDP type A) and 1798 controls free of neurodegenerative diseases were recruited
Enhanced gene ranking approaches using modified trace ratio algorithm for gene expression data

Directory of Open Access Journals (Sweden)

Shruti Mishra

Microarray technology enables the understanding and investigation of gene expression levels by analyzing high dimensional datasets that contain few samples. Over time, microarray expression data have been collected for studying the underlying biological mechanisms of disease. One such application for understanding the mechanism is by constructing a gene regulatory network (GRN). One of the foremost key criteria for GRN discovery is gene selection. Choosing a generous set of genes for the structure of the network is highly desirable. For this role, two suitable methods were proposed for selection of appropriate genes. The first approach comprises a gene selection method called Information gain, where the dataset is reformed and fused with another distinct algorithm called Trace Ratio (TR). Our second method is the implementation of our projected modified TR algorithm, where the scoring base for finding weight matrices has been re-designed. The efficiency of both methods was shown with different classifiers that include variants of the Artificial Neural Network classifier, such as Resilient Propagation, Quick Propagation, Back Propagation, Manhattan Propagation and Radial Basis Function Neural Network, and also the Support Vector Machine (SVM) classifier. In the study, it was confirmed that both of the proposed methods worked well and offered high accuracy with fewer iterations than the original Trace Ratio algorithm. Keywords: Gene regulatory network, Gene selection, Information gain, Trace ratio, Canonical correlation analysis, Classification
Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.

Science.gov (United States)

Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A

2017-08-07

High-throughput technology like microarrays is widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type were altered in order to reveal the limits that are imposed by the nature of microarray data on the inference process. We examined the effect of various factors on the inference performance such as the network size, the presence of noise in microarray data, and the network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Also, network sparseness, the triggering function type and noise in input data affect the inference performance. When two factors were simultaneously varied, it was found that variation of one parameter significantly affects the dynamic response of the other. Crucial factors were also examined using a real GRN and acquired results confirmed simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. Relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to the inference performance. Copyright © 2017 Elsevier
Elements of Causal Inference: Foundations and Learning Algorithms

DEFF Research Database (Denmark)

Peters, Jonas Martin; Janzing, Dominik; Schölkopf, Bernhard

A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning...
Ensemble stacking mitigates biases in inference of synaptic connectivity

Directory of Open Access Journals (Sweden)

Brendan Chambers

2018-03-01

A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches. Mapping the routing of spikes through local circuitry is crucial for understanding neocortical computation. Under appropriate experimental conditions, these maps can be used to infer likely patterns of synaptic recruitment, linking activity to underlying anatomical connections. Such inferences help to reveal the synaptic implementation of population dynamics and computation. We compare a number of standard functional measures to infer underlying connectivity. We find that regularization impacts measures

Parametric inference for biological sequence analysis.

Science.gov (United States)

Pachter, Lior; Sturmfels, Bernd

2004-11-16

One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
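For the hidden Markov model case named in this abstract, the sum-product algorithm reduces to the familiar forward recursion. The Python sketch below shows only that recursion, checked against brute-force enumeration of all hidden paths; it does not implement polytope propagation, and the two-state model and its probabilities are made-up values for illustration.

```python
from itertools import product

def forward(init, trans, emit, obs):
    """Sum-product (forward) recursion: P(obs) in O(T * S^2) time."""
    states = range(len(init))
    alpha = [init[s] * emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states]
    return sum(alpha)

def brute_force(init, trans, emit, obs):
    """Same quantity by summing over every hidden path: O(S^T) time."""
    total = 0.0
    for path in product(range(len(init)), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for prev, cur, o in zip(path, path[1:], obs[1:]):
            p *= trans[prev][cur] * emit[cur][o]
        total += p
    return total

# Hypothetical two-state model with a binary observation alphabet.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.3, 0.7]]
obs = [0, 1, 1, 0]

print(abs(forward(init, trans, emit, obs)
          - brute_force(init, trans, emit, obs)) < 1e-12)   # True
```

Replacing the sum with a maximum in `forward` gives the probability of the single best hidden path (the Viterbi recursion), which is the maximum a posteriori calculation whose parametric behavior the abstract refers to.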
Progranulin plasma levels predict the presence of GRN mutations in asymptomatic subjects and do not correlate with brain atrophy: Results from the GENFI study

NARCIS (Netherlands)

D. Galimberti (Daniela); Fumagalli, G.G. (Giorgio G.); C. Fenoglio (Chiara); Cioffi, S.M.G. (Sara M.G.); A. Arighi (Andrea); M. Serpente (Maria); B. Borroni (Barbara); A. Padovani (Alessandro); F. Tagliavini (Fabrizio); M. Masellis (Mario); M.C. Tartaglia (Maria Carmela); J.C. van Swieten (John); L.H.H. Meeter (Lieke H.H.); C. Graff (Caroline); A. De Mendonça (Alexandre); M. Bocchetta (Martina); J.D. Rohrer (Jonathan Daniel); Scarpini, E. (Elio)

2017-01-01

We investigated whether progranulin plasma levels are predictors of the presence of progranulin gene (GRN) null mutations or of the development of symptoms in asymptomatic at risk members participating in the Genetic Frontotemporal Dementia Initiative, including 19 patients, 64
A new learning algorithm for a fully connected neuro-fuzzy inference system.

Science.gov (United States)

Chen, C L Philip; Wang, Jing; Wang, Chi-Hsu; Chen, Long

2014-10-01

A traditional neuro-fuzzy system is transformed into an equivalent fully connected three-layer neural network (NN), namely, the fully connected neuro-fuzzy inference system (F-CONFIS). The F-CONFIS differs from traditional NNs by its dependent and repeated weights between input and hidden layers and can be considered as a variation of a kind of multilayer NN. Therefore, an efficient learning algorithm for the F-CONFIS to cope with these repeated weights is derived. Furthermore, a dynamic learning rate is proposed for neuro-fuzzy systems via F-CONFIS where both premise (hidden) and consequent portions are considered. Several simulation results indicate that the proposed approach achieves much better accuracy and fast convergence.
A relative variation-based method to unraveling gene regulatory networks.

Directory of Open Access Journals (Sweden)

Yali Wang

Full Text Available Gene regulatory network (GRN) reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that, compared with complicated methods such as those based on nonlinear differential equations, methods based on a simple statistic, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, a relative expression level variation (RELV) based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, the probability of the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate of the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
Effects of Sample Size and Dimensionality on the Performance of Four Algorithms for Inference of Association Networks in Metabonomics

NARCIS (Netherlands)

Suarez Diez, M.; Saccenti, E.

2015-01-01

We investigated the effect of sample size and dimensionality on the performance of four algorithms (ARACNE, CLR, CORR, and PCLRC) when they are used for the inference of metabolite association networks. We report that as many as 100-400 samples may be necessary to obtain stable network estimations,
Isolation and Characterization of Two Lytic Bacteriophages, φSt2 and φGrn1; Phage Therapy Application for Biological Control of Vibrio alginolyticus in Aquaculture Live Feeds.

Directory of Open Access Journals (Sweden)

Panos G Kalatzis

Full Text Available Bacterial infections are a serious problem in aquaculture since they can result in massive mortalities in farmed fish and invertebrates. Vibriosis is one of the most common diseases in marine aquaculture hatcheries and its causative agents are bacteria of the genus Vibrio, mostly entering larval rearing water through live feeds, such as Artemia and rotifers. The pathogenic Vibrio alginolyticus strain V1, isolated during a vibriosis outbreak in cultured seabream, Sparus aurata, was used as host to isolate and characterize the two novel bacteriophages φSt2 and φGrn1 for phage therapy application. In vitro cell lysis experiments were performed against the bacterial host V. alginolyticus strain V1 and also against 12 presumptive Vibrio strains originating from live prey Artemia salina cultures, indicating the strong lytic efficacy of the 2 phages. In vivo administration of the phage cocktail, φSt2 and φGrn1, at MOI = 100 directly on live prey A. salina cultures led to a 93% decrease of the presumptive Vibrio population after 4 h of treatment. The current study suggests that administration of φSt2 and φGrn1 to live prey could selectively reduce Vibrio load in fish hatcheries. Innovative and environmentally friendly solutions against bacterial diseases are more than necessary and phage therapy is one of them.
An Inference Language for Imaging

DEFF Research Database (Denmark)

Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen

2014-01-01

We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...
Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Takeshi Hase

Full Text Available Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is a key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network is similar to that with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
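The abstract above does not spell out how TopkNet combines its member algorithms, so the sketch below shows only a generic consensus baseline: averaging the rank each algorithm assigns to a candidate edge. The function name, the tuple encoding of edges, and the tie-breaking rule are all assumptions made for illustration.

```python
def average_rank_consensus(rankings):
    """Combine several edge rankings by their mean rank.

    rankings: list of lists; each inner list holds the same candidate edges,
              ordered from most to least confident by one inference algorithm.
    Returns the edges ordered by ascending mean rank (ties broken by edge name).
    """
    edges = rankings[0]
    rank_sum = {edge: 0 for edge in edges}
    for ranking in rankings:
        for position, edge in enumerate(ranking, start=1):
            rank_sum[edge] += position
    n = len(rankings)
    return sorted(edges, key=lambda edge: (rank_sum[edge] / n, edge))


# Hypothetical output of three algorithms on three candidate edges.
algo_1 = [("A", "B"), ("B", "C"), ("A", "C")]
algo_2 = [("B", "C"), ("A", "B"), ("A", "C")]
algo_3 = [("A", "B"), ("A", "C"), ("B", "C")]
consensus = average_rank_consensus([algo_1, algo_2, algo_3])
```

Restricting the input list to the algorithms that perform best on a similar, known dataset would be one way to mimic the selection idea described in the abstract, although the exact rule used there is not given.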
OKVAR-Boost: a novel boosting algorithm to infer nonlinear dynamics and interactions in gene regulatory networks.

Science.gov (United States)

Lim, Néhémy; Senbabaoglu, Yasin; Michailidis, George; d'Alché-Buc, Florence

2013-06-01

Reverse engineering of gene regulatory networks remains a central challenge in computational systems biology, despite recent advances facilitated by benchmark in silico challenges that have aided in calibrating their performance. A number of approaches using either perturbation (knock-out) or wild-type time-series data have appeared in the literature addressing this problem, with the latter using linear temporal models. Nonlinear dynamical models are particularly appropriate for this inference task, given the generation mechanism of the time-series data. In this study, we introduce a novel nonlinear autoregressive model based on operator-valued kernels that simultaneously learns the model parameters, as well as the network structure. A flexible boosting algorithm (OKVAR-Boost) that shares features from L2-boosting and randomization-based algorithms is developed to perform the tasks of parameter learning and network inference for the proposed model. Specifically, at each boosting iteration, a regularized Operator-valued Kernel-based Vector AutoRegressive model (OKVAR) is trained on a random subnetwork. The final model consists of an ensemble of such models. The empirical estimation of the ensemble model's Jacobian matrix provides an estimation of the network structure. The performance of the proposed algorithm is first evaluated on a number of benchmark datasets from the DREAM3 challenge and then on real datasets related to the In vivo Reverse-Engineering and Modeling Assessment (IRMA) and T-cell networks. The high-quality results obtained strongly indicate that it outperforms existing approaches. The OKVAR-Boost Matlab code is available as the archive: http://amis-group.fr/sourcecode-okvar-boost/OKVARBoost-v1.0.zip. Supplementary data are available at Bioinformatics online.
An algebra-based method for inferring gene regulatory networks.

Science.gov (United States)

Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

2014-03-26

The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
Adaptive Inference on General Graphical Models

OpenAIRE

Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur

2012-01-01

Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...
An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

KAUST Repository

Bayer, Christian

2016-02-20

© 2016 Taylor & Francis Group, LLC. ABSTRACT: In this work, we present an extension of the forward–reverse representation introduced by Bayer and Schoenmakers (Annals of Applied Probability, 24(5):1994–2032, 2014) to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, that is, SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the expectation-maximization algorithm to the phase I output. By selecting a set of overdispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
An efficient forward-reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

KAUST Repository

Vilanova, Pedro

2016-01-07

In this work, we present an extension of the forward-reverse representation introduced in "Simulation of forward-reverse stochastic representations for conditional diffusions", a 2014 paper by Bayer and Schoenmakers, to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time-intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization algorithm to the phase I output. By selecting a set of over-dispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
Inferring anatomical therapeutic chemical (ATC) class of drugs using shortest path and random walk with restart algorithms.

Science.gov (United States)

Chen, Lei; Liu, Tao; Zhao, Xian

2018-06-01

The anatomical therapeutic chemical (ATC) classification system is a widely accepted drug classification scheme. This system comprises five levels and includes several classes in each level. Drugs are classified into classes according to their therapeutic effects and characteristics. The first level includes 14 main classes. In this study, we proposed two network-based models to infer novel potential chemicals deemed to belong in the first level of ATC classification. To build these models, two large chemical networks were constructed using the chemical-chemical interaction information retrieved from the Search Tool for Interactions of Chemicals (STITCH). Two classic network algorithms, shortest path (SP) and random walk with restart (RWR) algorithms, were executed on the corresponding network to mine novel chemicals for each ATC class using the validated drugs in a class as seed nodes. Then, the obtained chemicals yielded by these two algorithms were further evaluated by a permutation test and an association test. The former can exclude chemicals produced by the structure of the network, i.e., false positive discoveries. By contrast, the latter identifies the most important chemicals that have strong associations with the ATC class. Comparisons indicated that the two models can provide quite dissimilar results, suggesting that the results yielded by one model can be essential supplements for those obtained by the other model. In addition, several representative inferred chemicals were analyzed to confirm the reliability of the results generated by the two models. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017 Elsevier B.V. All rights reserved.
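To make the random walk with restart (RWR) step in the abstract above concrete, here is a minimal pure-Python sketch. The dictionary-based graph encoding, the restart probability of 0.3, and the handling of isolated nodes are assumptions for illustration; the study's actual network construction from STITCH and its parameter choices are not reproduced.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency: dict node -> dict neighbour -> edge weight (list both directions
               for an undirected network; every neighbour must also be a key).
    seeds:     nodes that receive the restart mass (e.g. validated class members).
    Returns a dict node -> steady-state visiting probability.
    """
    nodes = list(adjacency)
    seeds = set(seeds)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    out_weight = {v: sum(adjacency[v].values()) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Assumption: an isolated node sends its mass back to the seeds.
                for v in nodes:
                    nxt[v] += (1 - restart) * p[u] * p0[v]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path network a - b - c - d with a single seed node "a".
toy_network = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = random_walk_with_restart(toy_network, ["a"])
```

Nodes with high scores are those most reachable from the seeds, which is the quantity that candidate chemicals would be ranked by before any permutation or association test.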
Ensemble stacking mitigates biases in inference of synaptic connectivity.

Science.gov (United States)

Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N

2018-01-01

A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
Intelligent PID controller based on ant system algorithm and fuzzy inference and its application to bionic artificial leg

Institute of Scientific and Technical Information of China (English)

谭冠政; 曾庆冬; 李文斌

2004-01-01

A design method for intelligent proportional-integral-derivative (PID) controllers is proposed, based on the ant system algorithm and fuzzy inference. This kind of controller is called the fuzzy-ant system PID controller. It consists of an off-line part and an on-line part. In the off-line part, for a given control system with a PID controller, the overshoot, settling time and steady-state error of the system's unit step response are taken as the performance indexes, and the ant system algorithm is used to obtain a group of optimal PID parameters Kp*, Ti* and Td*, which serve as the initial values for the on-line tuning of the PID parameters. In the on-line part, based on Kp*, Ti* and Td* and on the current system error e and its time derivative, a specific program optimizes and adjusts the PID parameters on-line through a fuzzy inference mechanism to ensure that the system response has optimal transient and steady-state performance. This kind of intelligent PID controller can be used to control the motor of the intelligent bionic artificial leg designed by the authors. Computer simulation results show that the controller has less overshoot and a shorter settling time.
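The three performance indexes named in the abstract above can be computed from a simulated unit step response. The Python sketch below does this for a PID controller in the Kp, Ti, Td form acting on a hypothetical first-order plant; the plant model, time step, 2% settling band and function names are assumptions, and neither the ant system search nor the fuzzy on-line tuning is implemented.

```python
def step_response(kp, ti, td, dt=0.01, steps=2000, tau=0.5):
    """Unit-step response of an assumed plant tau * y' + y = u under PID control.

    Control law: u = kp * (e + (1 / ti) * integral(e) + td * de/dt).
    """
    y, integral, prev_error = 0.0, 0.0, 1.0  # prev_error = 1 avoids a derivative kick
    response = []
    for _ in range(steps):
        error = 1.0 - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * (error + integral / ti + td * derivative)
        y += dt * (u - y) / tau  # forward-Euler step of the plant
        prev_error = error
        response.append(y)
    return response


def performance_indexes(response, dt=0.01, band=0.02):
    """Return (overshoot, settling time, steady-state error) of a step response."""
    overshoot = max(0.0, max(response) - 1.0)
    steady_state_error = abs(1.0 - response[-1])
    settling_time = 0.0
    # Settling time: last instant at which the output is outside the band.
    for k in range(len(response) - 1, -1, -1):
        if abs(response[k] - 1.0) > band:
            settling_time = (k + 1) * dt
            break
    return overshoot, settling_time, steady_state_error


overshoot, settling_time, sse = performance_indexes(step_response(2.0, 0.5, 0.01))
```

An off-line optimizer of the kind described would call such an evaluation repeatedly, scoring each candidate (Kp, Ti, Td) triple by some combination of the three indexes.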
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment.

Science.gov (United States)

Yao, Yao; Storme, Veronique; Marchal, Kathleen; Van de Peer, Yves

2016-01-01

We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population.
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment

Science.gov (United States)

Yao, Yao; Storme, Veronique; Marchal, Kathleen

2016-01-01

We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population. PMID:28028477
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment

Directory of Open Access Journals (Sweden)

Yao Yao

2016-12-01

Full Text Available We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population.
Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

Directory of Open Access Journals (Sweden)

Vipin Narang

Full Text Available Human gene regulatory networks (GRNs) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network, evaluated based on previously known roles of these genes in MCF-7 biology as well as on their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
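The layered hierarchy produced by K-core decomposition can be sketched with the standard peeling procedure: repeatedly remove the node of smallest remaining degree and record the largest such degree seen so far. The Python sketch below treats the network as undirected and uses made-up node names; both are simplifying assumptions, and the study's own software is not reproduced.

```python
def core_numbers(adjacency):
    """Core number of every node in an undirected graph.

    adjacency: dict node -> set of neighbouring nodes.
    A node with core number k belongs to the k-core but not to the (k+1)-core.
    """
    degree = {v: len(neighbours) for v, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adjacency[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Toy network: a tightly connected triangle (a, b, c) with a chain d - e hanging off it.
toy_grn = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
layers = core_numbers(toy_grn)
```

Nodes with the highest core number form the innermost layer, which is where the abstract reports the more influential regulators tend to sit. This simple version scans for the minimum at each step and is quadratic in the number of nodes; bucket-based variants run in linear time.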

Theoretic derivation of directed acyclic subgraph algorithm and comparisons with message passing algorithm

Science.gov (United States)

Ha, Jeongmok; Jeong, Hong

2016-07-01

This study investigates the directed acyclic subgraph (DAS) algorithm, which is used to solve discrete labeling problems much more rapidly than other Markov-random-field-based inference methods but at a competitive accuracy. However, the mechanism by which the DAS algorithm simultaneously achieves competitive accuracy and fast execution speed, has not been elucidated by a theoretical derivation. We analyze the DAS algorithm by comparing it with a message passing algorithm. Graphical models, inference methods, and energy-minimization frameworks are compared between DAS and message passing algorithms. Moreover, the performances of DAS and other message passing methods [sum-product belief propagation (BP), max-product BP, and tree-reweighted message passing] are experimentally compared.
Generating inferences from knowledge structures based on general automata

Energy Technology Data Exchange (ETDEWEB)

Koenig, E C

1983-01-01

The author shows that the model for knowledge structures for computers based on general automata accommodates procedures for establishing inferences. Algorithms are presented which generate inferences as output of a computer when its sentence input names appropriate knowledge elements contained in an associated knowledge structure already stored in the memory of the computer. The inferences are found to have either a single graph tuple or more than one graph tuple of associated knowledge. Six algorithms pertain to a single graph tuple and a seventh pertains to more than one graph tuple of associated knowledge. A named term is either the automaton, environment, auxiliary receptor, principal receptor, auxiliary effector, or principal effector. The algorithm pertaining to more than one graph tuple requires that the input sentence names the automaton, transformation response, and environment of one of the tuples of associated knowledge in a sequence of tuples. Interaction with the computer may be either in a conversation or examination mode. The algorithms are illustrated by an example. 13 references.
Efficient Exact Inference With Loss Augmented Objective in Structured Learning.

Science.gov (United States)

Bauer, Alexander; Nakajima, Shinichi; Muller, Klaus-Robert

2016-08-19

Structural support vector machine (SVM) is an elegant approach for building complex and accurate models with structured outputs. However, its applicability relies on the availability of efficient inference algorithms--the state-of-the-art training algorithms repeatedly perform inference to compute a subgradient or to find the most violating configuration. In this paper, we propose an exact inference algorithm for maximizing nondecomposable objectives due to a special type of high-order potential having a decomposable internal structure. As an important application, our method covers the loss augmented inference, which enables the slack and margin scaling formulations of structural SVM with a variety of dissimilarity measures, e.g., Hamming loss, precision and recall, Fβ-loss, intersection over union, and many other functions that can be efficiently computed from the contingency table. We demonstrate the advantages of our approach in natural language parsing and sequence segmentation applications.
Type Inference for Session Types in the Pi-Calculus

DEFF Research Database (Denmark)

Graversen, Eva Fajstrup; Harbo, Jacob Buchreitz; Huttel, Hans

2014-01-01

In this paper we present a direct algorithm for session type inference for the π-calculus. Type inference for session types has previously been achieved by either imposing limitations and restriction on the π-calculus, or by reducing the type inference problem to that for linear types. Our approach...
Inferring microRNA regulation of mRNA with partially ordered samples of paired expression data and exogenous prediction algorithms.

Directory of Open Access Journals (Sweden)

Brian Godsey

Full Text Available MicroRNAs (miRs) are known to play an important role in mRNA regulation, often by binding to complementary sequences in "target" mRNAs. Recently, several methods have been developed by which existing sequence-based target predictions can be combined with miR and mRNA expression data to infer true miR-mRNA targeting relationships. It has been shown that the combination of these two approaches gives more reliable results than either by itself. While a few such algorithms give excellent results, none fully addresses expression data sets with a natural ordering of the samples. If the samples in an experiment can be ordered or partially ordered by their expected similarity to one another, such as for time-series or studies of development processes, stages, or types (e.g. cell type, disease, growth, aging), there are unique opportunities to infer miR-mRNA interactions that may be specific to the underlying processes, and existing methods do not exploit this. We propose an algorithm which specifically addresses [partially] ordered expression data and takes advantage of sample similarities based on the ordering structure. This is done within a Bayesian framework which specifies posterior distributions and therefore statistical significance for each model parameter and latent variable. We apply our model to a previously published expression data set of paired miR and mRNA arrays in five partially ordered conditions, with biological replicates, related to multiple myeloma, and we show how considering potential orderings can improve the inference of miR-mRNA interactions, as measured by existing knowledge about the involved transcripts.
Lower complexity bounds for lifted inference

DEFF Research Database (Denmark)

Jaeger, Manfred

2015-01-01

instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show...... the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about...... that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing...
Making Type Inference Practical

DEFF Research Database (Denmark)

Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens

1992-01-01

We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Abo......, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information....... Experiments indicate that the implementation type checks as much as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...
Inferring motion and location using WLAN RSSI

NARCIS (Netherlands)

Kavitha Muthukrishnan, K.; van der Zwaag, B.J.; Havinga, Paul J.M.; Fuller, R.; Koutsoukos, X.

2009-01-01

We present novel algorithms to infer movement by making use of inherent fluctuations in the received signal strengths from existing WLAN infrastructure. We evaluate the performance of the presented algorithms based on classification metrics such as recall and precision using annotated traces
A linear programming model for protein inference problem in shotgun proteomics.

Science.gov (United States)

Huang, Ting; He, Zengyou

2012-11-15

Assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is an important issue in shotgun proteomics. The objective of protein inference is to find a subset of proteins that are truly present in the sample. Although many methods have been proposed for protein inference, several issues such as peptide degeneracy remain unsolved. In this article, we present a linear programming model for protein inference. In this model, we use a transformation of the joint probability that each peptide/protein pair is present in the sample as the variable. Then, both the peptide probability and protein probability can be expressed as a formula in terms of the linear combination of these variables. Based on this simple fact, the protein inference problem is formulated as an optimization problem: minimize the number of proteins with non-zero probabilities under the constraint that the difference between the calculated peptide probability and the peptide probability generated from peptide identification algorithms should be less than some threshold. This model addresses the peptide degeneracy issue by forcing some joint probability variables involving degenerate peptides to be zero in a rigorous manner. The corresponding inference algorithm is named ProteinLP. We test the performance of ProteinLP on six datasets. Experimental results show that our method is competitive with the state-of-the-art protein inference algorithms. The source code of our algorithm is available at: https://sourceforge.net/projects/prolp/. Contact: zyhe@dlut.edu.cn. Supplementary data are available at Bioinformatics Online.
Progranulin plasma levels predict the presence of GRN mutations in asymptomatic subjects and do not correlate with brain atrophy: results from the GENFI study.

Science.gov (United States)

Galimberti, Daniela; Fumagalli, Giorgio G; Fenoglio, Chiara; Cioffi, Sara M G; Arighi, Andrea; Serpente, Maria; Borroni, Barbara; Padovani, Alessandro; Tagliavini, Fabrizio; Masellis, Mario; Tartaglia, Maria Carmela; van Swieten, John; Meeter, Lieke; Graff, Caroline; de Mendonça, Alexandre; Bocchetta, Martina; Rohrer, Jonathan D; Scarpini, Elio

2018-02-01

We investigated whether progranulin plasma levels are predictors of the presence of progranulin gene (GRN) null mutations or of the development of symptoms in asymptomatic at-risk members participating in the Genetic Frontotemporal Dementia Initiative, including 19 patients, 64 asymptomatic carriers, and 77 noncarriers. In addition, we evaluated a possible role of TMEM106B rs1990622 as a genetic modifier and correlated progranulin plasma levels and gray-matter atrophy. Mean ± SD plasma progranulin levels in patients and asymptomatic carriers were significantly decreased compared with noncarriers (30.5 ± 13.0 and 27.7 ± 7.5 versus 99.6 ± 24.8 ng/mL, p 61.55 ng/mL, the test had a sensitivity of 98.8% and a specificity of 97.5% in predicting the presence of a mutation, independent of symptoms. No correlations were found between progranulin plasma levels and age, years from average age at onset in each family, or TMEM106B rs1990622 genotype (p > 0.05). Plasma progranulin levels did not correlate with brain atrophy. Plasma progranulin levels predict the presence of GRN null mutations independent of proximity to symptoms and brain atrophy.
Mathematical inference and control of molecular networks from perturbation experiments

Science.gov (United States)

Mohammed-Rasheed, Mohammed

One of the main challenges facing biologists and mathematicians in the post-genomic era is to understand the behavior of molecular networks and harness this understanding into an educated intervention of the cell. The cell maintains its function via an elaborate network of interconnecting positive and negative feedback loops of genes, RNA and proteins that send different signals to a large number of pathways and molecules. These structures are referred to as genetic regulatory networks (GRNs) or molecular networks. GRNs can be viewed as dynamical systems with inherent properties and mechanisms, such as steady-state equilibria and stability, that determine the behavior of the cell. The biological relevance of the mathematical concepts is important, as they may predict the differentiation of a stem cell, the maintenance of a normal cell, the development of cancer and its aberrant behavior, and the design of drugs and response to therapy. Uncovering the underlying GRN structure from gene/protein expression data, e.g., microarrays or perturbation experiments, is called inference or reverse engineering of the molecular network. Because of the high cost and time-consuming nature of biological experiments, the number of available measurements or experiments is very small compared to the number of molecules (genes, RNA and proteins). In addition, the observations are noisy, where the noise is due to measurement imperfections as well as the inherent stochasticity of genetic expression levels. Intra-cellular activities and extra-cellular environmental attributes are another source of variability. Thus, the inference of GRNs is, in general, an under-determined problem with a highly noisy set of observations. The ultimate goal of GRN inference and analysis is to be able to intervene within the network, in order to force it away from undesirable cellular states and into desirable ones. However, it remains a major challenge to design optimal intervention strategies
Forecasting building energy consumption with hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system

Energy Technology Data Exchange (ETDEWEB)

Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)

2010-11-15

There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, the hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are presented and compared with those of a regular NN method, indicating that the GA-HANFIS model performs better than NNs in terms of forecasting accuracy.
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems.

Science.gov (United States)

Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research proposes Multiple Linear Regression (MLR) to infer GRNs from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations in the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was minimal. Apart from that, the Belsley collinearity test showed that multicollinearity greatly affected the datasets used in this experiment. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
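The cascade-error idea in the record above can be illustrated with a minimal sketch. It is not the authors' procedure: the simulated cascade A → B → C, the coefficients 0.9 and 0.8, the noise level, and the helper names `fit_two_predictors`, `simple_slope` and `simulate_cascade` are all assumptions made for illustration. The point is only that regressing C on A alone suggests a strong A → C link, while regressing C on A and B together shrinks the coefficient of A towards zero.

```python
import random


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the centred normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def simple_slope(x, y):
    """Slope of the one-predictor regression y ~ x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    return sum((u - mx) * (w - my) for u, w in zip(x, y)) / sum((u - mx) ** 2 for u in x)


def simulate_cascade(n, seed=0):
    """Hypothetical expression data for a cascade A -> B -> C with no direct A -> C edge."""
    rng = random.Random(seed)
    gene_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gene_b = [0.9 * v + rng.gauss(0.0, 0.5) for v in gene_a]
    gene_c = [0.8 * v + rng.gauss(0.0, 0.5) for v in gene_b]
    return gene_a, gene_b, gene_c


gene_a, gene_b, gene_c = simulate_cascade(2000)
marginal_a = simple_slope(gene_a, gene_c)            # looks like a direct A -> C effect
_, partial_a, partial_b = fit_two_predictors(gene_a, gene_b, gene_c)
print(f"C ~ A alone: slope {marginal_a:.2f}")
print(f"C ~ A + B:   coef(A) {partial_a:.2f}, coef(B) {partial_b:.2f}")
```

In this toy setting the joint fit attributes the effect to B, which is the behaviour the abstract relies on to avoid reporting A → C as a direct interaction.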
State-Space Inference and Learning with Gaussian Processes

OpenAIRE

Turner, R; Deisenroth, MP; Rasmussen, CE

2010-01-01

State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model. C...
The NIFTY way of Bayesian signal inference

International Nuclear Information System (INIS)

Selig, Marco

2014-01-01

We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
The NIFTy way of Bayesian signal inference

Science.gov (United States)

Selig, Marco

2014-12-01

We introduce NIFTy, "Numerical Information Field Theory", a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTy can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTy as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy.
Bayesian Inference Methods for Sparse Channel Estimation

DEFF Research Database (Denmark)

Pedersen, Niels Lovmand

2013-01-01

This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Application of Bayesian inference to stochastic analytic continuation

International Nuclear Information System (INIS)

Fuchs, S; Pruschke, T; Jarrell, M

2010-01-01

We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data. The algorithm is strictly based on principles of Bayesian statistical inference. It utilizes Monte Carlo simulations to calculate a weighted average of possible energy spectra. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum entropy calculation.
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

Directory of Open Access Journals (Sweden)

Faridah Hani Mohamed Salleh

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research proposes Multiple Linear Regression (MLR) to infer GRNs from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations in the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was minimal. Apart from that, the Belsley collinearity test showed that multicollinearity greatly affected the datasets used in this experiment. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees.

Science.gov (United States)

Stolzer, Maureen; Lai, Han; Xu, Minli; Sathaye, Deepa; Vernot, Benjamin; Durand, Dannie

2012-09-15

Gene duplication (D), transfer (T), loss (L) and incomplete lineage sorting (I) are crucial to the evolution of gene families and the emergence of novel functions. The history of these events can be inferred via comparison of gene and species trees, a process called reconciliation, yet current reconciliation algorithms model only a subset of these evolutionary processes. We present an algorithm to reconcile a binary gene tree with a nonbinary species tree under a DTLI parsimony criterion. This is the first reconciliation algorithm to capture all four evolutionary processes driving tree incongruence and the first to reconcile non-binary species trees with a transfer model. Our algorithm infers all optimal solutions and reports complete, temporally feasible event histories, giving the gene and species lineages in which each event occurred. It is fixed-parameter tractable, with polytime complexity when the maximum species outdegree is fixed. Application of our algorithms to prokaryotic and eukaryotic data shows that use of an incomplete event model has substantial impact on the events inferred and resulting biological conclusions. Our algorithms have been implemented in Notung, a freely available phylogenetic reconciliation software package, available at http://www.cs.cmu.edu/~durand/Notung. Contact: mstolzer@andrew.cmu.edu.

Likelihood-Based Inference of B Cell Clonal Families.

Directory of Open Access Journals (Sweden)

Duncan K Ralph

2016-10-01

The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called "rearrangement" forming progenitor B cells, then a Darwinian process of lineage diversification and selection called "affinity maturation." The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem "clonal family inference." In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
A neuro-fuzzy inference system for sensor monitoring

International Nuclear Information System (INIS)

Na, Man Gyun

2001-01-01

A neuro-fuzzy inference system combined with the wavelet denoising, PCA (principal component analysis) and SPRT (sequential probability ratio test) methods has been developed to monitor the relevant sensor using the information of other sensors. The parameters of the neuro-fuzzy inference system, which estimates the relevant sensor signal, are optimized by a genetic algorithm and a least-squares algorithm. The wavelet denoising technique was applied to remove noise components in input signals into the neuro-fuzzy system. By reducing the dimension of the input space of the neuro-fuzzy system without losing a significant amount of information, the PCA was used to reduce the time necessary to train the neuro-fuzzy system, simplify the structure of the neuro-fuzzy inference system and ease the selection of the input signals into the neuro-fuzzy system. By using the residual signals between the estimated signals and the measured signals, the SPRT is applied to detect whether the sensors are degraded or not. The proposed sensor-monitoring algorithm was verified through applications to the pressurizer water level, the pressurizer pressure, and the hot-leg temperature sensors in pressurized water reactors.
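The SPRT step on the residual signal can be sketched as follows. This is a generic Wald sequential probability ratio test for a mean shift in Gaussian residuals, not the configuration used in the record above: the function name `sprt`, the assumed shift and noise level, and the error rates `alpha` and `beta` are illustrative assumptions.

```python
import math


def sprt(residuals, shift, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT on residuals: H0 mean 0 (healthy) vs H1 mean `shift` (degraded).

    Residuals are assumed Gaussian with known standard deviation `sigma`.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    llr = 0.0
    for count, r in enumerate(residuals, start=1):
        # Log-likelihood ratio increment for N(shift, sigma^2) against N(0, sigma^2).
        llr += (shift / sigma ** 2) * (r - shift / 2)
        if llr >= upper:
            return "degraded", count
        if llr <= lower:
            return "normal", count
    return "undecided", len(residuals)


print(sprt([0.0] * 50, shift=1.0, sigma=1.0))
print(sprt([1.0] * 50, shift=1.0, sigma=1.0))
```

The test stops as soon as the accumulated evidence crosses either threshold, which is why it suits online monitoring: the number of samples needed adapts to how clearly the residuals favour one hypothesis.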
SPEEDY: An Eclipse-based IDE for invariant inference

Directory of Open Access Journals (Sweden)

David R. Cok

2014-04-01

SPEEDY is an Eclipse-based IDE for exploring techniques that assist users in generating correct specifications, particularly including invariant inference algorithms and tools. It integrates with several back-end tools that propose invariants and will incorporate published algorithms for inferring object and loop invariants. Though the architecture is language-neutral, current SPEEDY targets C programs. Building and using SPEEDY has confirmed earlier experience demonstrating the importance of showing and editing specifications in the IDEs that developers customarily use, automating as much of the production and checking of specifications as possible, and showing counterexample information directly in the source code editing environment. As in previous work, automation of specification checking is provided by back-end SMT solvers. However, reducing the effort demanded of software developers using formal methods also requires a GUI design that guides users in writing, reviewing, and correcting specifications and automates specification inference.
Bayesian inference for spatio-temporal spike-and-slab priors

DEFF Research Database (Denmark)

Andersen, Michael Riis; Vehtari, Aki; Winther, Ole

2017-01-01

a transformed Gaussian process on the spike-and-slab probabilities. An expectation propagation (EP) algorithm for posterior inference under the proposed model is derived. For large scale problems, the standard EP algorithm can be prohibitively slow. We therefore introduce three different approximation schemes...
Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

Directory of Open Access Journals (Sweden)

Fei Xiao

Combining path consistency (PC) algorithms with conditional mutual information (CMI) is widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computation complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
Algorithms for Bayesian network modeling and reliability assessment of infrastructure systems

International Nuclear Information System (INIS)

Tien, Iris; Der Kiureghian, Armen

2016-01-01

Novel algorithms are developed to enable the modeling of large, complex infrastructure systems as Bayesian networks (BNs). These include a compression algorithm that significantly reduces the memory storage required to construct the BN model, and an updating algorithm that performs inference on compressed matrices. These algorithms address one of the major obstacles to widespread use of BNs for system reliability assessment, namely the exponentially increasing amount of information that needs to be stored as the number of components in the system increases. The proposed compression and inference algorithms are described and applied to example systems to investigate their performance compared to that of existing algorithms. Orders of magnitude savings in memory storage requirement are demonstrated using the new algorithms, enabling BN modeling and reliability analysis of larger infrastructure systems. - Highlights: • Novel algorithms developed for Bayesian network modeling of infrastructure systems. • Algorithm presented to compress information in conditional probability tables. • Updating algorithm presented to perform inference on compressed matrices. • Algorithms applied to example systems to investigate their performance. • Orders of magnitude savings in memory storage requirement demonstrated.
Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

Directory of Open Access Journals (Sweden)

Xiaobo Guo

Nonlinear dependence is common in the regulation mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC) has been shown to be powerful and computationally effective for nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated datasets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference.
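To make the distance correlation concrete, here is a minimal pure-Python sketch of the standard sample statistic (double-centred pairwise distance matrices, then a normalised product). It is not the CLR-DC, MRNET-DC or REL-DC code from the record above; the function names and the toy data are assumptions for illustration. The example contrasts it with the Pearson coefficient on a purely quadratic relationship.

```python
import math


def _centered_distances(values):
    """Pairwise |a - b| matrix with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]   # equals the column means (symmetric matrix)
    grand = sum(row_mean) / n
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1D samples, in [0, 1]."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    if dvar_x * dvar_y == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


def pearson(x, y):
    """Pearson correlation coefficient, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


regulator = list(range(-5, 6))
target = [v * v for v in regulator]          # purely nonlinear dependence
print("Pearson:", pearson(regulator, target))
print("Distance correlation:", distance_correlation(regulator, target))
```

On this symmetric toy sample the Pearson coefficient vanishes while the distance correlation stays clearly positive, which is the property that motivates using it for nonlinear regulation. The double loop costs O(n²) time and memory per gene pair.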
Grouping preprocess for haplotype inference from SNP and CNV data

International Nuclear Information System (INIS)

Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato; Kamatani, Naoyuki

2009-01-01

The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.
Grouping preprocess for haplotype inference from SNP and CNV data

Energy Technology Data Exchange (ETDEWEB)

Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Inoue, Masato [Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555 (Japan); Kamatani, Naoyuki, E-mail: masato.inoue@eb.waseda.ac.j [Institute of Rheumatology, Tokyo Women's Medical University, 10-22, Kawada-cho, Shinjuku-ku, Tokyo 162-0054 (Japan)

2009-12-01

The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model

International Nuclear Information System (INIS)

Takaishi, Tetsuya

2015-01-01

The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve a speedup similar to that of CUDA Fortran.
Metis: A Pure Metropolis Markov Chain Monte Carlo Bayesian Inference Library

Energy Technology Data Exchange (ETDEWEB)

Bates, Cameron Russell [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Mckigney, Edward Allen [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2018-01-09

The use of Bayesian inference in data analysis has become the standard for large scientific experiments [1, 2]. The Monte Carlo Codes Group (XCP-3) at Los Alamos has developed a simple set of algorithms currently implemented in C++ and Python to easily perform flat-prior Markov Chain Monte Carlo Bayesian inference with pure Metropolis sampling. These implementations are designed to be user friendly and extensible for customization based on specific application requirements. This document describes the algorithmic choices made and presents two use cases.
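As a rough illustration of flat-prior Metropolis sampling, the sketch below draws from the posterior of a Gaussian mean with known noise level. It does not reproduce the Metis API, which is not described in this record: the function names `metropolis` and `gaussian_log_likelihood`, the toy data, the proposal step size and the burn-in length are all assumptions.

```python
import math
import random


def gaussian_log_likelihood(mu, data, sigma):
    """Log-likelihood (up to a constant) of `data` under N(mu, sigma^2)."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_likelihood, start, step, n_samples, rng):
    """Pure Metropolis sampler with a symmetric Gaussian random-walk proposal.

    With a flat prior the posterior is proportional to the likelihood,
    so the acceptance ratio is simply the likelihood ratio.
    """
    current = start
    current_ll = log_likelihood(current)
    chain = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_ll = log_likelihood(proposal)
        accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
        if rng.random() < accept_prob:
            current, current_ll = proposal, proposal_ll
        chain.append(current)
    return chain


data = [4.8, 5.1, 5.3, 4.9, 5.4]             # hypothetical measurements
chain = metropolis(lambda mu: gaussian_log_likelihood(mu, data, 0.5),
                   start=0.0, step=0.3, n_samples=20000, rng=random.Random(0))
kept = chain[2000:]                          # discard burn-in
print("posterior mean estimate:", sum(kept) / len(kept))
```

For this toy model the exact flat-prior posterior is Gaussian, centred on the sample mean with standard deviation sigma/sqrt(n), so the chain average can be checked against a known answer.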
Personalized recommendation via unbalance full-connectivity inference

Science.gov (United States)

Ma, Wenping; Ren, Chen; Wu, Yue; Wang, Shanfeng; Feng, Xiang

2017-10-01

Recommender systems play an important role in helping us find useful information. They are widely used by most e-commerce web sites to push potentially relevant items to individual users according to their purchase history. Network-based recommendation algorithms, which use two types of elements to represent users and items respectively, are popular and effective. In this paper, based on the consistence-based inference (CBI) algorithm, we propose a novel network-based algorithm in which users and items are treated without distinction. The proposed algorithm also uses information diffusion to find the relationship between users and items. Different from traditional network-based recommendation algorithms, information diffusion is initialized from users and items, respectively. Experiments show that the proposed algorithm is effective compared with traditional network-based recommendation algorithms.
Recursive algorithms for phylogenetic tree counting.

Science.gov (United States)

Gavryushkina, Alexandra; Welch, David; Drummond, Alexei J

2013-10-28

In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information is available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. We describe an algorithm that is polynomial in the number of sampled individuals for counting resolutions of a constraint tree, assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
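For the "easy" case mentioned above, where all samples come from the same time point, the counts have well-known closed forms, sketched below. This is not one of the recursive algorithms of the record, which address constrained and serially sampled trees; the function names are illustrative assumptions. Rooted binary trees on n labelled tips number (2n-3)!!, and ranked trees (labelled histories), in which internal nodes are ordered in time, number n!(n-1)!/2^(n-1).

```python
import math


def count_unranked_trees(n):
    """Rooted binary trees on n labelled tips: (2n-3)!! = 1 * 3 * ... * (2n-3)."""
    return math.prod(range(1, 2 * n - 2, 2))


def count_ranked_trees(n):
    """Ranked rooted binary trees on n labelled tips sampled at the same time.

    Going back in time, each coalescence picks one of C(k, 2) pairs among the
    k current lineages, giving prod_{k=2..n} C(k, 2) = n! (n-1)! / 2^(n-1).
    """
    return math.prod(k * (k - 1) // 2 for k in range(2, n + 1))


for n in range(2, 7):
    print(n, count_unranked_trees(n), count_ranked_trees(n))
```

The ranked count grows faster than the unranked one because a single tree shape can admit several orderings of its internal nodes, which matters when a prior is meant to be uniform over one space rather than the other.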
A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium

Energy Technology Data Exchange (ETDEWEB)

Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.

2009-04-20

Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks are critical tasks in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
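The abstract says only that CLR uses mutual information. As a rough sketch of the commonly described CLR scoring step, each mutual-information value is compared with the background distribution of its two genes and the two z-scores are combined. The function name, the use of the population standard deviation, and the exclusion of the diagonal are assumptions made for illustration, not details taken from SEBINI.

```python
from math import sqrt
from statistics import mean, pstdev


def clr_scores(mi):
    """Turn a symmetric mutual-information matrix into CLR-style scores."""
    n = len(mi)
    # Background mean and spread of each gene's MI values (diagonal excluded).
    stats = []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]
        stats.append((mean(row), pstdev(row)))

    def z(i, j):
        # z-score of MI(i, j) against gene i's background, clipped at zero.
        mu, sd = stats[i]
        return max(0.0, (mi[i][j] - mu) / sd) if sd > 0 else 0.0

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = sqrt(z(i, j) ** 2 + z(j, i) ** 2)
            scores[i][j] = scores[j][i] = s
    return scores
```

An edge list would then be obtained by keeping the pairs whose score exceeds a chosen cutoff; that cutoff is a user decision and is not shown here.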
Classical methods for interpreting objective function minimization as intelligent inference

Energy Technology Data Exchange (ETDEWEB)

Golden, R.M. [Univ. of Texas, Dallas, TX (United States)]

1996-12-31

Most recognition algorithms and neural networks can be formally viewed as seeking a minimum value of an appropriate objective function during either classification or learning phases. The goal of this paper is to argue that in order to show a recognition algorithm is making intelligent inferences, it is not sufficient to show that the recognition algorithm is computing (or trying to compute) the global minimum of some objective function. One must explicitly define a "relational system" for the recognition algorithm or neural network which identifies the: (i) sample space, (ii) the relevant sigma-field of events generated by the sample space, and (iii) the "relation" for that relational system. Only when such a "relational system" is properly defined is it possible to formally establish the sense in which computing the global minimum of an objective function is an intelligent inference.
Detection of Cheating by Decimation Algorithm

Science.gov (United States)

Yamanaka, Shogo; Ohzeki, Masayuki; Decelle, Aurélien

2015-02-01

We expand the item response theory to study the case of "cheating students" for a set of exams, trying to detect them by applying a greedy algorithm of inference. This extended model is closely related to the Boltzmann machine learning. In this paper we aim to infer the correct biases and interactions of our model by considering a relatively small number of sets of training data. Nevertheless, the greedy algorithm that we employed in the present study exhibits good performance with a small number of training data. The key point is the sparseness of the interactions in our problem in the context of the Boltzmann machine learning: the existence of cheating students is expected to be very rare (possibly even in the real world). We compare a standard approach to infer the sparse interactions in the Boltzmann machine learning to our greedy algorithm and we find the latter to be superior in several aspects.
Optimal inverse magnetorheological damper modeling using shuffled frog-leaping algorithm–based adaptive neuro-fuzzy inference system approach

Directory of Open Access Journals (Sweden)

Xiufang Lin

2016-08-01

Full Text Available Magnetorheological dampers have become prominent semi-active control devices for vibration mitigation of structures which are subjected to severe loads. However, the damping force cannot be controlled directly due to the inherent nonlinear characteristics of the magnetorheological dampers. Therefore, for fully exploiting the capabilities of the magnetorheological dampers, one of the challenging aspects is to develop an accurate inverse model which can appropriately predict the input voltage to control the damping force. In this article, a hybrid modeling strategy combining shuffled frog-leaping algorithm and adaptive-network-based fuzzy inference system is proposed to model the inverse dynamic characteristics of the magnetorheological dampers for improving the modeling accuracy. The shuffled frog-leaping algorithm is employed to optimize the premise parameters of the adaptive-network-based fuzzy inference system while the consequent parameters are tuned by a least square estimation method, here known as shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach. To evaluate the effectiveness of the proposed approach, the inverse modeling results based on the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach are compared with those based on the adaptive-network-based fuzzy inference system and genetic algorithm–based adaptive-network-based fuzzy inference system approaches. Analysis of variance test is carried out to statistically compare the performance of the proposed methods and the results demonstrate that the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system strategy outperforms the other two methods in terms of modeling (training) accuracy and checking accuracy.
Inferring the conservative causal core of gene regulatory networks

Directory of Open Access Journals (Sweden)

Emmert-Streib Frank

2010-09-01

Full Text Available Abstract. Background: Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results: In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions: For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Inferring the conservative causal core of gene regulatory networks.

Science.gov (United States)

Altay, Gökmen; Emmert-Streib, Frank

2010-09-28

Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
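The abstract describes C3NET as combining mutual-information estimates with a maximization step. A minimal sketch of that idea follows: each gene contributes at most one edge, to the partner with which it shares the highest mutual information. The function name is hypothetical, and the plain `threshold` parameter is an assumption that stands in for whatever significance test the authors apply.

```python
def c3net_edges(mi, threshold=0.0):
    """Return undirected edges (i, j), i < j, from a symmetric MI matrix."""
    n = len(mi)
    edges = set()
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        if not candidates:
            continue
        # Maximization step: the single strongest partner of gene i.
        best = max(candidates, key=lambda j: mi[i][j])
        # Assumed filter: keep the edge only if its MI exceeds the threshold.
        if mi[i][best] > threshold:
            edges.add((min(i, best), max(i, best)))
    return edges
```

Because every gene adds at most one edge, the result has at most as many edges as genes, which is consistent with the "conservative core" in the title.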
General Purpose Probabilistic Programming Platform with Effective Stochastic Inference

Science.gov (United States)

2018-04-01

Figure 1. The problem of inferring curves from data while simultaneously choosing the... bottom path) as the inverse problem to computer graphics (top path). Figure 18. An illustration of generative probabilistic graphics for 3D... Building these systems involves simultaneously developing mathematical models, inference algorithms and optimized software implementations. Small changes

Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood.

Science.gov (United States)

Wu, Yufeng

2012-03-01

Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
Measurement of Stock Market Liquidity Supported By an Algorithm Inferring the Initiator of a Trade

Directory of Open Access Journals (Sweden)

Joanna Olbryś

2017-01-01

Full Text Available The aim of this study is to assess and analyse selected liquidity/illiquidity measures derived from high-frequency intraday data from the Warsaw Stock Exchange (WSE). As the side initiating a trade cannot be directly identified from a raw data set, firstly the Lee-Ready algorithm for inferring the initiator of a trade is employed to distinguish between so-called buyer- and seller-initiated trades. Intraday data for fifty-three WSE-listed companies divided into three size groups cover the period from January 3, 2005 to June 30, 2015. The paper provides an analysis of the robustness of the obtained results with respect to the whole sample and three consecutive subsamples, each of equal size: covering the pre-crisis, crisis, and post-crisis periods. The empirical results turn out to be robust to the choice of the period. Furthermore, hypotheses concerning the statistical significance of coefficients of correlation between the daily values of three liquidity proxies used in the study are tested. (original abstract)
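The Lee-Ready rule mentioned above is usually described as a quote test followed by a tick test. A trade priced above the bid-ask midpoint is labelled buyer-initiated and one below it seller-initiated. A trade exactly at the midpoint takes the direction of the most recent price change. The sketch below assumes each trade already carries the bid and ask prevailing at trade time. That pairing, the integer tick prices, and the function name are illustrative assumptions rather than details from the study.

```python
def classify_trades(trades):
    """Label trades given as (price, bid, ask) tuples.

    Returns a list of "buy", "sell", or None (undecidable so far).
    """
    labels = []
    last_price = None
    last_direction = None  # direction of the most recent non-zero price change
    for price, bid, ask in trades:
        mid = (bid + ask) / 2
        # Update the tick direction only when the price actually moved.
        if last_price is not None and price != last_price:
            last_direction = "buy" if price > last_price else "sell"
        if price > mid:
            label = "buy"            # quote test: above the midpoint
        elif price < mid:
            label = "sell"           # quote test: below the midpoint
        else:
            label = last_direction   # tick test for midpoint trades
        labels.append(label)
        last_price = price
    return labels
```

The first trade of a session at the midpoint stays unlabelled because no earlier price exists for the tick test.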
Predictive minimum description length principle approach to inferring gene regulatory networks.

Science.gov (United States)

Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping

2011-01-01

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
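The quantity being thresholded in this family of methods is the mutual information between two discretized expression profiles. A minimal plug-in estimator, in bits, might look like the sketch below. The function name is hypothetical, and the PMDL-based threshold selection from the paper is not reproduced.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of MI (in bits) between two equal-length discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed pairs: p(a,b) * log2( p(a,b) / (p(a) p(b)) ).
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )
```

Two identical binary profiles with balanced states give 1 bit, and two independent ones give 0, so a threshold between those extremes decides whether an edge is drawn.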
Universal Darwinism As a Process of Bayesian Inference.

Science.gov (United States)

Campbell, John O

2016-01-01

Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
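The equivalence claimed in the abstract can be seen in a small numerical sketch: a discrete Bayesian update multiplies each prior weight by a likelihood and renormalizes, which has the same form as updating type frequencies by relative fitness. The function name and the reading of the inputs as "frequencies" and "fitnesses" are illustrative assumptions.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior_i * likelihood_i, renormalized.

    Read as selection: prior = current type frequencies,
    likelihood = fitness of each type, result = next-generation frequencies.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)  # the evidence, or mean fitness
    if total == 0:
        raise ValueError("all hypotheses have zero posterior weight")
    return [u / total for u in unnormalized]
```

With two equally frequent types whose fitnesses are 0.2 and 0.6, the frequencies after one update are 0.25 and 0.75.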
Fuzzy logic controller using different inference methods

International Nuclear Information System (INIS)

Liu, Z.; De Keyser, R.

1994-01-01

In this paper the design of fuzzy controllers by using different inference methods is introduced. Configuration of the fuzzy controllers includes a general rule-base which is a collection of fuzzy PI or PD rules, the triangular fuzzy data model and a centre of gravity defuzzification algorithm. The generalized modus ponens (GMP) is used with the minimum operator of the triangular norm. Under the sup-min inference rule, six fuzzy implication operators are employed to calculate the fuzzy look-up tables for each rule base. The performance is tested in simulated systems with MATLAB/SIMULINK. Results show the effects of using the fuzzy controllers with different inference methods, applied to different test processes.
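The ingredients named in the abstract (triangular membership functions, the minimum operator, sup-min inference, and centre of gravity defuzzification) can be combined in a few lines. The sketch below uses only the minimum (Mamdani-style) implication, not the six operators compared in the paper. The single-input rule format and all names are assumptions made for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def infer(error, rules, universe):
    """Single-input fuzzy inference with centre of gravity defuzzification.

    rules: list of (antecedent, consequent) pairs, each an (a, b, c) triangle.
    universe: discretized output values over which the centroid is taken.
    """
    # Firing strength of every rule for the crisp input.
    firing = [(tri(error, *ante), cons) for ante, cons in rules]
    num = den = 0.0
    for u in universe:
        # Min implication per rule, max aggregation across rules.
        mu = max(min(w, tri(u, *cons)) for w, cons in firing)
        num += u * mu
        den += mu
    return num / den if den > 0 else 0.0
```

With three symmetric rules (negative, zero, positive) mapping the error onto a matching output set, an input of 0 yields an output near 0 and an input of 1 yields an output near 1.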
A new and accurate fault location algorithm for combined transmission lines using Adaptive Network-Based Fuzzy Inference System

Energy Technology Data Exchange (ETDEWEB)

Sadeh, Javad; Afradi, Hamid [Electrical Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, P.O. Box: 91775-1111, Mashhad (Iran)

2009-11-15

This paper presents a new and accurate algorithm for locating faults in a combined overhead transmission line with underground power cable using Adaptive Network-Based Fuzzy Inference System (ANFIS). The proposed method uses 10 ANFIS networks and consists of 3 stages, including fault type classification, faulty section detection and exact fault location. In the first part, an ANFIS is used to determine the fault type, applying four inputs, i.e., fundamental component of three phase currents and zero sequence current. Another ANFIS network is used to detect the faulty section, whether the fault is on the overhead line or on the underground cable. Eight other ANFIS networks are utilized to pinpoint the faults (two for each fault type). Four inputs, i.e., the dc component of the current, fundamental frequency of the voltage and current and the angle between them, are used to train the neuro-fuzzy inference systems in order to accurately locate the faults on each part of the combined line. The proposed method is evaluated under different fault conditions such as different fault locations, different fault inception angles and different fault resistances. Simulation results confirm that the proposed method can be used as an efficient means for accurate fault location on the combined transmission lines. (author)
Design of UAV robust autopilot based on adaptive neuro-fuzzy inference system

Directory of Open Access Journals (Sweden)

Mohand Achour Touat

2008-04-01

Full Text Available This paper is devoted to the application of adaptive neuro-fuzzy inference systems to the robust control of the UAV longitudinal motion. The adaptive neuro-fuzzy inference system model needs to be trained by input/output data. These data were obtained from the modeling of a "crisp" robust control system. The synthesis of this system is based on the separation theorem, which defines the structure and parameters of the LQG-optimal controller, followed by robust optimization of this controller based on a genetic algorithm. Such a design procedure can define the rule base and parameters of the fuzzification and defuzzification algorithms of the adaptive neuro-fuzzy inference system controller, which ensure the robust properties of the control system. Simulation of the closed loop control system of UAV longitudinal motion with the adaptive neuro-fuzzy inference system controller demonstrates the high efficiency of the proposed design procedure.
Poisson-Based Inference for Perturbation Models in Adaptive Spelling Training

Science.gov (United States)

Baschera, Gian-Marco; Gross, Markus

2010-01-01

We present an inference algorithm for perturbation models based on Poisson regression. The algorithm is designed to handle unclassified input with multiple errors described by independent mal-rules. This knowledge representation provides an intelligent tutoring system with local and global information about a student, such as error classification…
Comparative study of discretization methods of microarray data for inferring transcriptional regulatory networks

Directory of Open Access Journals (Sweden)

Ji Wei

2010-10-01

Full Text Available Abstract. Background: Microarray data discretization is a basic preprocessing step for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary and no systematic comparison of different discretization methods has been conducted in the context of gene regulatory network inference from time series gene expression data. Results: In this study, we propose a new discretization method, "bikmeans", and compare its performance with four other widely-used discretization methods using different datasets, modeling algorithms and numbers of intervals. Sensitivities, specificities and total accuracies were calculated and statistical analysis was carried out. The bikmeans method always gave high total accuracies. Conclusions: Our results indicate that proper discretization methods can consistently improve gene regulatory network inference independent of network modeling algorithms and datasets. Our new method, bikmeans, resulted in significantly better total accuracies than other methods.
MATRIX-VECTOR ALGORITHMS OF LOCAL POSTERIORI INFERENCE IN ALGEBRAIC BAYESIAN NETWORKS ON QUANTA PROPOSITIONS

Directory of Open Access Journals (Sweden)

A. A. Zolotin

2015-07-01

Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
Universal Darwinism as a process of Bayesian inference

Directory of Open Access Journals (Sweden)

John Oberon Campbell

2016-06-01

Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes’ Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an 'experiment' in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the 'experiment'. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation

DEFF Research Database (Denmark)

Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri

2013-01-01

representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Inferring Stop-Locations from WiFi

DEFF Research Database (Denmark)

Wind, David Kofoed; Sapiezynski, Piotr; Furman, Magdalena Anna

2016-01-01

methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select...
New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data.

Science.gov (United States)

Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S

2017-04-01

Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types-single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite
Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations.

Directory of Open Access Journals (Sweden)

Xiaodong Cai

Full Text Available Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the desiderata of efficient and powerful algorithms still remain. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Using MOEA with Redistribution and Consensus Branches to Infer Phylogenies.

Science.gov (United States)

Min, Xiaoping; Zhang, Mouzhao; Yuan, Sisi; Ge, Shengxiang; Liu, Xiangrong; Zeng, Xiangxiang; Xia, Ningshao

2017-12-26

In recent years, to infer phylogenies, which are NP-hard problems, more and more research has focused on using metaheuristics. Maximum Parsimony and Maximum Likelihood are two effective ways to conduct inference. Based on these methods, which can also be considered as the optimal criteria for phylogenies, various kinds of multi-objective metaheuristics have been used to reconstruct phylogenies. However, combining these two time-consuming methods results in those multi-objective metaheuristics being slower than a single objective. Therefore, we propose a novel, multi-objective optimization algorithm, MOEA-RC, to accelerate the processes of rebuilding phylogenies using structural information of elites in current populations. We compare MOEA-RC with two representative multi-objective algorithms, MOEA/D and NSGA-II, and a non-consensus version of MOEA-RC on three real-world datasets. The results show that, within a given number of iterations, MOEA-RC achieves better solutions than the other algorithms.
Causal Inference and Explaining Away in a Spiking Network

Science.gov (United States)

Moreno-Bote, Rubén; Drugowitsch, Jan

2015-01-01

While the brain uses spiking neurons for communication, theoretical research on brain computations has mostly focused on non-spiking networks. The nature of spike-based algorithms that achieve complex computations, such as object probabilistic inference, is largely unknown. Here we demonstrate that a family of high-dimensional quadratic optimization problems with non-negativity constraints can be solved exactly and efficiently by a network of spiking neurons. The network naturally imposes the non-negativity of causal contributions that is fundamental to causal inference, and uses simple operations, such as linear synapses with realistic time constants, and neural spike generation and reset non-linearities. The network infers the set of most likely causes from an observation using explaining away, which is dynamically implemented by spike-based, tuned inhibition. The algorithm performs remarkably well even when the network intrinsically generates variable spike trains, the timing of spikes is scrambled by external sources of noise, or the network is mistuned. This type of network might underlie tasks such as odor identification and classification. PMID:26621426
BagReg: Protein inference through machine learning.

Science.gov (United States)

Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

2015-08-01

Protein inference from the identified peptides is of primary importance in shotgun proteomics. The goal of protein inference is to identify whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method first artificially extracts five features from the input data, and then chooses each feature in turn as the class feature to build separate models that predict the presence probabilities of proteins. Finally, the weak results from the five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to the state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Visual recognition and inference using dynamic overcomplete sparse learning.

Science.gov (United States)

Murray, Joseph F; Kreutz-Delgado, Kenneth

2007-09-01

We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
An efficient forward–reverse expectation-maximization algorithm for statistical inference in stochastic reaction networks

KAUST Repository

Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

2016-01-01

then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve

Accelerating inference for diffusions observed with measurement error and large sample sizes using approximate Bayesian computation

DEFF Research Database (Denmark)

Picchini, Umberto; Forman, Julie Lyng

2016-01-01

In recent years, dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However, it is often computationally infeasible to apply exact statistical methodologies in the context of large data sets and complex models. This paper considers a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An approximate Bayesian computation (ABC)-MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm ... applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter being two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large protein data set. The suggested methodology is fairly general...
Scalable inference for stochastic block models

KAUST Repository

Peng, Chengbin

2017-12-08

Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example in which it finds more meaningful communities than a popular modularity maximization algorithm is also illustrated.
Graphical models for inferring single molecule dynamics

Directory of Open Access Journals (Sweden)

Gonzalez Ruben L

2010-10-01

Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization (VBEM) algorithm. The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well. Results: The VBEM algorithm returns the model's evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model's parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem. Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.
An analysis pipeline for the inference of protein-protein interaction networks

Energy Technology Data Exchange (ETDEWEB)

Taylor, Ronald C.; Singhal, Mudita; Daly, Don S.; Gilmore, Jason M.; Cannon, William R.; Domico, Kelly O.; White, Amanda M.; Auberry, Deanna L.; Auberry, Kenneth J.; Hooker, Brian S.; Hurst, G. B.; McDermott, Jason E.; McDonald, W. H.; Pelletier, Dale A.; Schmoyer, Denise A.; Wiley, H. S.

2009-12-01

An analysis pipeline has been created for deployment of a novel algorithm, the Bayesian Estimator of Protein-Protein Association Probabilities (BEPro), for use in the reconstruction of protein-protein interaction networks. We have combined the Software Environment for BIological Network Inference (SEBINI), an interactive environment for the deployment and testing of network inference algorithms that use high-throughput data, and the Collective Analysis of Biological Interaction Networks (CABIN), software that allows integration and analysis of protein-protein interaction and gene-to-gene regulatory evidence obtained from multiple sources, to allow interactions computed by BEPro to be stored, visualized, and further analyzed. Incorporating BEPro into SEBINI and automatically feeding the resulting inferred network into CABIN, we have created a structured workflow for protein-protein network inference and supplemental analysis from sets of mass spectrometry bait-prey experiment data. SEBINI demo site: https://www.emsl.pnl.gov/SEBINI/ Contact: ronald.taylor@pnl.gov. BEPro is available at http://www.pnl.gov/statistics/BEPro3/index.htm. Contact: ds.daly@pnl.gov. CABIN is available at http://www.sysbio.org/dataresources/cabin.stm. Contact: mudita.singhal@pnl.gov.
Fused Regression for Multi-source Gene Regulatory Network Inference.

Directory of Open Access Journals (Sweden)

Kari Y Lam

2016-12-01

Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types). We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships, such as orthology, incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross-species network inference. Finally, we demonstrate our method's utility in learning from data collected on different experimental platforms.
Comparison of Machine Learning Techniques in Inferring Phytoplankton Size Classes

Directory of Open Access Journals (Sweden)

Shuibo Hu

2018-03-01

The size of phytoplankton not only influences its physiology, metabolic rates and the marine food web, but also serves as an indicator of phytoplankton functional roles in ecological and biogeochemical processes. Therefore, some algorithms have been developed to infer the synoptic distribution of phytoplankton cell size, denoted as phytoplankton size classes (PSCs), in surface ocean waters, by means of remotely sensed variables. This study, using the NASA bio-Optical Marine Algorithm Data set (NOMAD) high performance liquid chromatography (HPLC) database and satellite match-ups, aimed to compare the effectiveness of modeling techniques, including partial least squares (PLS), artificial neural networks (ANN), support vector machine (SVM) and random forests (RF), and feature selection techniques, including genetic algorithm (GA), successive projection algorithm (SPA) and recursive feature elimination based on support vector machine (SVM-RFE), for inferring PSCs from remote sensing data. Results showed that: (1) SVM-RFE worked better in selecting sensitive features; (2) RF performed better than PLS, ANN and SVM in calibrating PSC retrieval models; (3) machine learning techniques produced better performance than the chlorophyll-a based three-component method; (4) sea surface temperature, wind stress, and spectral curvature derived from the remote sensing reflectance at 490, 510, and 555 nm were among the most sensitive features to PSCs; and (5) the combination of SVM-RFE feature selection and random forests regression was recommended for inferring PSCs. This study demonstrated the effectiveness of machine learning techniques in selecting sensitive features and calibrating models for PSC estimation with remote sensing.
Causal inference in biology networks with integrated belief propagation.

Science.gov (United States)

Chang, Rui; Karr, Jonathan R; Schadt, Eric E

2015-01-01

Inferring causal relationships among molecular and higher order phenotypes is a critical step in elucidating the complexity of living systems. Here we propose a novel method for inferring causality that is no longer constrained by the conditional dependency arguments that limit the ability of statistical causal inference methods to resolve causal relationships within sets of graphical models that are Markov equivalent. Our method utilizes Bayesian belief propagation to infer the responses of molecular traits to perturbation events given a hypothesized graph structure. A distance measure between the inferred response distribution and the observed data is defined to assess the 'fitness' of the hypothesized causal relationships. To test our algorithm, we infer causal relationships within equivalence classes of gene networks in which the form of the possible functional interactions is assumed to be nonlinear, given synthetic microarray and RNA sequencing data. We also apply our method to infer causality in a real metabolic network with a v-structure and a feedback loop. We show that our method can recapitulate the causal structure and recover the feedback loop from steady-state data alone, which conventional methods cannot do.
Subjective randomness as statistical inference.

Science.gov (United States)

Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B

2018-06-01

Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
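The idea of scoring randomness as evidence for a random generating process can be illustrated with a log-likelihood ratio. The sketch below is only an illustration under stated assumptions, not the authors' model: the "random" process is a fair coin, and the "regular" alternative is a hypothetical equal mixture of two simple Markov chains, one that tends to repeat the previous outcome and one that tends to alternate. The function names and the `p_stay` parameter are invented for this example.

```python
from math import log2


def markov_likelihood(seq, p_stay):
    """P(seq) under a two-state chain: uniform first symbol, then each
    symbol repeats the previous one with probability p_stay."""
    prob = 0.5
    for prev, cur in zip(seq, seq[1:]):
        prob *= p_stay if cur == prev else 1.0 - p_stay
    return prob


def randomness_score(seq, p_stay=0.9):
    """log2 P(seq | random) - log2 P(seq | regular).

    Positive values favour the fair-coin process; negative values favour
    the (hypothetical) regular process that repeats or alternates.
    """
    if not seq or set(seq) - {"H", "T"}:
        raise ValueError("seq must be a non-empty string of 'H' and 'T'")
    p_random = 0.5 ** len(seq)
    # Assumed regular model: equal mixture of a repeating chain and an
    # alternating chain.
    p_regular = 0.5 * (
        markov_likelihood(seq, p_stay) + markov_likelihood(seq, 1.0 - p_stay)
    )
    return log2(p_random) - log2(p_regular)


for flips in ("HHHHHHHH", "HTHTHTHT", "HHTHTTTH"):
    print(flips, round(randomness_score(flips), 2))
```

Under these assumptions, eight heads in a row and strict alternation both score below zero, while an irregular sequence scores above zero, which matches the intuition described in the abstract. Changing the regular model changes the scores, so the numbers carry no meaning beyond this toy setting.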
Structural influence of gene networks on their inference: analysis of C3NET

Directory of Open Access Journals (Sweden)

Emmert-Streib Frank

2011-06-01

Background: The availability of large-scale high-throughput data poses considerable challenges for their functional analysis. For this reason, gene network inference methods have gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. Results: In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET - a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead to a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide a user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. Conclusions: The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research. The availability of our easy-to-use software package c3net may contribute to the popularization of such methods. Reviewers: This article was reviewed by Lev Klebanov, Joel Bader and Yuriy Gusev.
Inferring Stop-Locations from WiFi.

Directory of Open Access Journals (Sweden)

David Kofoed Wind

Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring 'stop locations' (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach that selects the most important routers, and one that uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants' GPS data as well as ground truth data collected during a two-month period.
Fisher information and statistical inference for phase-type distributions

DEFF Research Database (Denmark)

Bladt, Mogens; Esparza, Luz Judith R; Nielsen, Bo Friis

2011-01-01

This paper is concerned with statistical inference for both continuous and discrete phase-type distributions. We consider maximum likelihood estimation, where traditionally the expectation-maximization (EM) algorithm has been employed. Certain numerical aspects of this method are revised and we...
Implementing and analyzing the multi-threaded LP-inference

Science.gov (United States)

Bolotova, S. Yu; Trofimenko, E. V.; Leschinskaya, M. V.

2018-03-01

The logical production equations provide new possibilities for backward inference optimization in intelligent production-type systems. The strategy of relevant backward inference is aimed at minimizing the number of queries to an external information source (either a database or an interactive user). The idea of the method is based on computing the initial set of preimages and searching for the true preimage. The execution of each stage can be organized independently and in parallel, and the actual work at a given stage can also be distributed between parallel computers. This paper is devoted to parallel algorithms for relevant inference based on an advanced "pipeline" scheme of parallel computations, which makes it possible to increase the degree of parallelism. The authors also provide some details of the implementation of LP-structures.
Optimization of Indoor Thermal Comfort Parameters with the Adaptive Network-Based Fuzzy Inference System and Particle Swarm Optimization Algorithm

Directory of Open Access Journals (Sweden)

Jing Li

2017-01-01

The goal of this study is to improve thermal comfort and indoor air quality with the adaptive network-based fuzzy inference system (ANFIS) model and an improved particle swarm optimization (PSO) algorithm. A method to optimize air conditioning parameters and installation distance is proposed. The methodology is demonstrated through a prototype case, which corresponds to a typical laboratory in colleges and universities. A laboratory model is established, and simulated flow field information is obtained with the CFD software. Subsequently, the ANFIS model is employed instead of the CFD model to predict indoor flow parameters, and the CFD database is utilized to train ANN input-output "metamodels" for the subsequent optimization. With the improved PSO algorithm and the stratified sequence method, the objective functions are optimized. The functions comprise PMV, PPD, and mean age of air. The optimal installation distance is determined with the hemisphere model. Results show that most of the staff obtain a satisfactory degree of thermal comfort and that the proposed method can significantly reduce the cost of building an experimental device. The proposed methodology can be used to determine appropriate air supply parameters and air conditioner installation position for a pleasant and healthy indoor environment.
A curious robot: An explorative-exploitive inference algorithm

DEFF Research Database (Denmark)

Pedersen, Kim Steenstrup; Johansen, Peter

2007-01-01

We propose a sequential learning algorithm with a focus on robot control. It is initialised by a teacher who directs the robot through a series of example solutions of a problem. Left alone, the control chooses its next action by prediction based on a variable order Markov chain model selected to...
Reinforcement and inference in cross-situational word learning.

Science.gov (United States)

Tilles, Paulo F C; Fontanari, José F

2013-01-01

Cross-situational word learning is based on the notion that a learner can determine the referent of a word by finding something in common across many observed uses of that word. Here we propose an adaptive learning algorithm that contains a parameter that controls the strength of the reinforcement applied to associations between concurrent words and referents, and a parameter that regulates inference, which includes built-in biases, such as mutual exclusivity, and information of past learning events. By adjusting these parameters so that the model predictions agree with data from representative experiments on cross-situational word learning, we were able to explain the learning strategies adopted by the participants of those experiments in terms of a trade-off between reinforcement and inference. These strategies can vary wildly depending on the conditions of the experiments. For instance, for fast mapping experiments (i.e., the correct referent could, in principle, be inferred in a single observation) inference is prevalent, whereas for segregated contextual diversity experiments (i.e., the referents are separated in groups and are exhibited with members of their groups only) reinforcement is predominant. Other experiments are explained with more balanced doses of reinforcement and inference.
Cytoprophet: a Cytoscape plug-in for protein and domain interaction networks inference.

Science.gov (United States)

Morcos, Faruck; Lamanna, Charles; Sikora, Marcin; Izaguirre, Jesús

2008-10-01

Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM); the set cover approach maximum specificity set cover (MSSC); and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and is available at Cytoprophet's website: http://cytoprophet.cse.nd.edu.
Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides

Directory of Open Access Journals (Sweden)

Wojciech Wieczorek

2016-01-01

This paper contributes to the field of bioinformatics by applying grammatical inference to the analysis of data. We developed an algorithm for generating star-free regular expressions which turned out to be good recommendation tools, as they are characterized by a relatively high correlation coefficient between the observed and predicted binary classifications. The experiments have been performed for three datasets of amyloidogenic hexapeptides, and our results are compared with those obtained using the graph approaches, the current state-of-the-art methods in heuristic automata induction, and the support vector machine. The results showed the superior performance of the new grammatical inference algorithm on fixed-length amyloid datasets.
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR

International Nuclear Information System (INIS)

Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin

2015-01-01

Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
Inverse Ising inference with correlated samples

International Nuclear Information System (INIS)

Obermayer, Benedikt; Levine, Erel

2014-01-01

Correlations between two variables of a high-dimensional system can be indicative of an underlying interaction, but can also result from indirect effects. Inverse Ising inference is a method to distinguish one from the other. Essentially, the parameters of the least constrained statistical model are learned from the observed correlations such that direct interactions can be separated from indirect correlations. Among many other applications, this approach has been helpful for protein structure prediction, because residues which interact in the 3D structure often show correlated substitutions in a multiple sequence alignment. In this context, samples used for inference are not independent but share an evolutionary history on a phylogenetic tree. Here, we discuss the effects of correlations between samples on global inference. Such correlations could arise due to phylogeny but also via other slow dynamical processes. We present a simple analytical model to address the resulting inference biases, and develop an exact method accounting for background correlations in alignment data by combining phylogenetic modeling with an adaptive cluster expansion algorithm. We find that popular reweighting schemes are only marginally effective at removing phylogenetic bias, suggest a rescaling strategy that yields better results, and provide evidence that our conclusions carry over to the frequently used mean-field approach to the inverse Ising problem.
Phase inductance estimation for switched reluctance motor using adaptive neuro-fuzzy inference system

International Nuclear Information System (INIS)

Daldaban, Ferhat; Ustkoyuncu, Nurettin; Guney, Kerim

2006-01-01

A new method based on an adaptive neuro-fuzzy inference system (ANFIS) for estimating the phase inductance of switched reluctance motors (SRMs) is presented. The ANFIS combines the expert knowledge of a fuzzy inference system with the learning capability of neural networks. A hybrid learning algorithm, which combines the least squares method and the back propagation algorithm, is used to identify the parameters of the ANFIS. The rotor position and the phase current of the 6/4 pole SRM are used to predict the phase inductance. The phase inductance results predicted by the ANFIS are in excellent agreement with the results of the finite element method.

NIFTY - Numerical Information Field Theory. A versatile PYTHON library for signal inference

Science.gov (United States)

Selig, M.; Bell, M. R.; Junklewitz, H.; Oppermann, N.; Reinecke, M.; Greiner, M.; Pachajoa, C.; Enßlin, T. A.

2013-06-01

NIFTy (Numerical Information Field Theory) is a software package designed to enable the development of signal inference algorithms that operate regardless of the underlying spatial grid and its resolution. Its object-oriented framework is written in Python, although it accesses libraries written in Cython, C++, and C for efficiency. NIFTy offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on fields into classes. Thereby, the correct normalization of operations on fields is taken care of automatically without concerning the user. This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory. Thus, NIFTy permits its user to rapidly prototype algorithms in 1D, and then apply the developed code in higher-dimensional settings of real world problems. The set of spaces on which NIFTy operates comprises point sets, n-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those. The functionality and diversity of the package is demonstrated by a Wiener filter code example that successfully runs without modification regardless of the space on which the inference problem is defined. NIFTy homepage http://www.mpa-garching.mpg.de/ift/nifty/; Excerpts of this paper are part of the NIFTy source code and documentation.
Bayesian inference from count data using discrete uniform priors.

Directory of Open Access Journals (Sweden)

Federico Comoglio

We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
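The setup described in this abstract (binomial likelihood, discrete uniform prior on the population size) can be sketched by brute-force enumeration over the prior support. This is a minimal sketch, assuming independent samples (sampling with replacement) and sampling fractions strictly between 0 and 1; it is not the dupiR implementation or the paper's closed-form formula, and the function name and arguments are hypothetical.

```python
from math import comb, exp, log


def posterior_population_size(counts, fractions, n_min, n_max):
    """Return {n: P(n | counts)} for integer n in [n_min, n_max].

    counts[i] objects were observed in a sample covering fractions[i]
    of the total volume. The prior on n is uniform on [n_min, n_max].
    """
    largest = max(counts)
    log_post = {}
    for n in range(max(n_min, largest), n_max + 1):
        # Binomial log-likelihood, summed over independent samples.
        log_post[n] = sum(
            log(comb(n, k)) + k * log(r) + (n - k) * log(1.0 - r)
            for k, r in zip(counts, fractions)
        )
    if not log_post:
        raise ValueError("n_max must be at least the largest observed count")
    # Normalise in a numerically stable way (subtract the maximum first).
    top = max(log_post.values())
    weights = {n: exp(v - top) for n, v in log_post.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


if __name__ == "__main__":
    post = posterior_population_size([5], [0.1], 5, 200)
    print(max(post, key=post.get))  # posterior mode, close to 5 / 0.1
```

Enumeration costs time linear in the width of the prior support, which is acceptable for small supports but is exactly what a closed-form expression avoids.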
Cycle-Based Cluster Variational Method for Direct and Inverse Inference

Science.gov (United States)

Furtlehner, Cyril; Decelle, Aurélien

2016-08-01

Large scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). This requires solving, in principle, two related problems: the first is to find offline the parameters of the MRF from empirical data (inverse problem); the second (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problems with mean-field methods of statistical physics, going beyond the Bethe approximation and the associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feedback loops as much as possible by selecting a minimal cycle basis. Following this line, we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next, we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
Fast Inference with Min-Sum Matrix Product.

Science.gov (United States)

Felzenszwalb, Pedro F; McAuley, Julian J

2011-12-01

The MAP inference problem in many graphical models can be solved efficiently using a fast algorithm for computing min-sum products of n × n matrices. The class of models in question includes cyclic and skip-chain models that arise in many applications. Although the worst-case complexity of the min-sum product operation is not known to be much better than O(n^3), an O(n^2.5) expected time algorithm was recently given, subject to some constraints on the input matrices. In this paper, we give an algorithm that runs in O(n^2 log n) expected time, assuming that the entries in the input matrices are independent samples from a uniform distribution. We also show that two variants of our algorithm are quite fast for inputs that arise in several applications. This leads to significant performance gains over previous methods in applications within computer vision and natural language processing.
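For reference, the min-sum product itself is ordinary matrix multiplication with (+, ×) replaced by (min, +). The sketch below is the naive cubic-time baseline only, not the faster expected-time algorithm of the paper; the function name `min_sum_product` is hypothetical.

```python
def min_sum_product(a, b):
    """Naive min-sum product: c[i][j] = min over k of a[i][k] + b[k][j]."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


if __name__ == "__main__":
    # If a and b hold pairwise costs between consecutive variables of a
    # chain, c[i][j] is the cheapest cost of starting in state i and ending
    # in state j, minimised over the state of the middle variable.
    a = [[1, 3], [2, 0]]
    b = [[0, 4], [1, 2]]
    print(min_sum_product(a, b))  # [[1, 5], [1, 2]]
```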
Intelligent Modeling Combining Adaptive Neuro Fuzzy Inference System and Genetic Algorithm for Optimizing Welding Process Parameters

Science.gov (United States)

Gowtham, K. N.; Vasudevan, M.; Maduraimuthu, V.; Jayakumar, T.

2011-04-01

Modified 9Cr-1Mo ferritic steel is used as a structural material for steam generator components of power plants. Generally, tungsten inert gas (TIG) welding is preferred for welding of these steels, in which the depth of penetration achievable during autogenous welding is limited. Therefore, activated flux TIG (A-TIG) welding, a novel welding technique, has been developed in-house to increase the depth of penetration. In modified 9Cr-1Mo steel joints produced by the A-TIG welding process, weld bead width, depth of penetration, and heat-affected zone (HAZ) width play an important role in determining the mechanical properties as well as the performance of the weld joints during service. To obtain the desired weld bead geometry and HAZ width, it becomes important to set the welding process parameters. In this work, an adaptive neuro-fuzzy inference system is used to develop independent models correlating welding process parameters such as current, voltage, and torch speed with weld bead shape parameters such as depth of penetration, bead width, and HAZ width. A genetic algorithm is then employed to determine the optimum A-TIG welding process parameters to obtain the desired weld bead shape parameters and HAZ width.
Inference-Based Surface Reconstruction of Cluttered Environments

KAUST Repository

Biggers, K.

2012-08-01

We present an inference-based surface reconstruction algorithm that is capable of identifying objects of interest among a cluttered scene, and reconstructing solid model representations even in the presence of occluded surfaces. Our proposed approach incorporates a predictive modeling framework that uses a set of user-provided models for prior knowledge, and applies this knowledge to the iterative identification and construction process. Our approach uses a local to global construction process guided by rules for fitting high-quality surface patches obtained from these prior models. We demonstrate the application of this algorithm on several example data sets containing heavy clutter and occlusion. © 2012 IEEE.
A Grammar Inference Approach for Predicting Kinase Specific Phosphorylation Sites

Science.gov (United States)

Datta, Sutapa; Mukhopadhyay, Subhasis

2015-01-01

Kinase-mediated phosphorylation is a key post-translational mechanism that plays an important role in regulating various cellular processes and phenotypes. Many diseases, such as cancer, are related to signaling defects associated with protein phosphorylation. Characterizing the protein kinases and their substrates enhances our ability to understand the mechanism of protein phosphorylation and extends our knowledge of signaling networks, thereby helping us to treat such diseases. Experimental methods for identifying phosphorylation sites are labour intensive and expensive. Moreover, the manifold increase of protein sequences in the databanks over the years necessitates fast and accurate computational methods for predicting phosphorylation sites in protein sequences. To date, a number of computational methods have been proposed by various researchers for predicting phosphorylation sites, but there remains much scope for improvement. In this communication, we present a simple and novel method based on a Grammatical Inference (GI) approach to automate the prediction of kinase specific phosphorylation sites. In this regard, we have used a popular GI algorithm, Alergia, to infer Deterministic Stochastic Finite State Automata (DSFA), which equivalently represent the regular grammar corresponding to the phosphorylation sites. Extensive experiments on several datasets generated by us reveal that our inferred grammar successfully predicts phosphorylation sites in a kinase specific manner. It performs significantly better than other existing phosphorylation site prediction methods. We have also compared our inferred DSFA with two other GI inference algorithms. The DSFA generated by our method performs better, which indicates that our method is robust and has potential for predicting phosphorylation sites in a kinase specific manner. PMID:25886273
Approximation Methods for Inference and Learning in Belief Networks: Progress and Future Directions

National Research Council Canada - National Science Library

Pazzan, Michael

1997-01-01

.... In this research project, we have investigated methods and implemented algorithms for efficiently making certain classes of inference in belief networks, and for automatically learning certain...
Sequential Uniformly Reweighted Sum-Product Algorithm for Cooperative Localization in Wireless Networks

OpenAIRE

Li, Wei; Yang, Zhen; Hu, Haifeng

2014-01-01

Graphical models have been widely applied in solving distributed inference problems in wireless networks. In this paper, we formulate the cooperative localization problem in a mobile network as an inference problem on a factor graph. Using a sequential schedule of message updates, a sequential uniformly reweighted sum-product algorithm (SURW-SPA) is developed for mobile localization problems. The proposed algorithm combines the distributed nature of belief propagation (BP) with the improved p...
Impact of noise on molecular network inference.

Directory of Open Access Journals (Sweden)

Radhakrishnan Nagarajan

Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can be an outcome of several factors, including inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay, or measurement artefacts prevalent especially in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pairwise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as a part of the learning process. Popular network motifs and fundamental connections, namely (a) common-effect, (b) three-chain, and (c) coherent type-I feed-forward loop (FFL), are investigated. The choice of these elementary networks can be attributed to their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms, Grow-Shrink (GS) and Incremental Association Markov Blanket (IAMB), which implicitly incorporate tests for conditional independence, is investigated. Finally, the impact of noise on networks inferred from publicly available single cell molecular expression profiles is investigated. While discrepancies in noise variance are overlooked in routine molecular network inference, the
A retrodictive stochastic simulation algorithm

International Nuclear Information System (INIS)

Vaughan, T.G.; Drummond, P.D.; Drummond, A.J.

2010-01-01

In this paper we describe a simple method for inferring the initial states of systems evolving stochastically according to master equations, given knowledge of the final states. This is achieved through the use of a retrodictive stochastic simulation algorithm which complements the usual predictive stochastic simulation approach. We demonstrate the utility of this new algorithm by applying it to example problems, including the derivation of likely ancestral states of a gene sequence given a Markovian model of genetic mutation.
Inference-based procedural modeling of solids

KAUST Repository

Biggers, Keith

2011-11-01

As virtual environments become larger and more complex, there is an increasing need for more automated construction algorithms to support the development process. We present an approach for modeling solids by combining prior examples with a simple sketch. Our algorithm uses an inference-based approach to incrementally fit patches together in a consistent fashion to define the boundary of an object. This algorithm samples and extracts surface patches from input models, and develops a Petri net structure that describes the relationship between patches along an imposed parameterization. Then, given a new parameterized line or curve, we use the Petri net to logically fit patches together in a manner consistent with the input model. This allows us to easily construct objects of varying sizes and configurations using arbitrary articulation, repetition, and interchanging of parts. The result of our process is a solid model representation of the constructed object that can be integrated into a simulation-based environment. © 2011 Elsevier Ltd. All rights reserved.
The Probabilistic Convolution Tree: Efficient Exact Bayesian Inference for Faster LC-MS/MS Protein Inference

Science.gov (United States)

Serang, Oliver

2014-01-01

Exact Bayesian inference can sometimes be performed efficiently for special cases where a function has commutative and associative symmetry of its inputs (called “causal independence”). For this reason, it is desirable to exploit such symmetry on big data sets. Here we present a method to exploit a general form of this symmetry on probabilistic adder nodes by transforming those probabilistic adder nodes into a probabilistic convolution tree with which dynamic programming computes exact probabilities. A substantial speedup is demonstrated using an illustration example that can arise when identifying splice forms with bottom-up mass spectrometry-based proteomics. On this example, even state-of-the-art exact inference algorithms require a runtime more than exponential in the number of splice forms considered. By using the probabilistic convolution tree, we reduce the runtime to and the space to where is the number of variables joined by an additive or cardinal operator. This approach, which can also be used with junction tree inference, is applicable to graphs with arbitrary dependency on counting variables or cardinalities and can be used on diverse problems and fields like forward error correcting codes, elemental decomposition, and spectral demixing. The approach also trivially generalizes to multiple dimensions. PMID:24626234
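The core idea of combining many additive variables through a balanced tree of pairwise convolutions can be sketched as follows. This is a minimal sketch of the forward (bottom-up) pass only, assuming independent non-negative integer variables given as probability mass functions; it uses direct rather than FFT-based convolution and omits the backward pass needed for posteriors on individual variables, so it does not reproduce the paper's complexity bounds. All names are hypothetical.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (index = value)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def sum_distribution(pmfs):
    """Distribution of the sum of all variables via a balanced merge tree."""
    if not pmfs:
        raise ValueError("need at least one distribution")
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        # Merge neighbours pairwise; an unpaired last node is carried upward.
        merged = [convolve(layer[i], layer[i + 1])
                  for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:
            merged.append(layer[-1])
        layer = merged
    return layer[0]


if __name__ == "__main__":
    # Three fair coin flips: the number of heads is Binomial(3, 0.5).
    print(sum_distribution([[0.5, 0.5]] * 3))  # [0.125, 0.375, 0.375, 0.125]
```

Keeping the intermediate layers instead of discarding them is what would let a second, top-down pass propagate evidence on the sum back to each input variable.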
Fast algorithms for computing phylogenetic divergence time.

Science.gov (United States)

Crosby, Ralph W; Williams, Tiffani L

2017-12-06

The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms, we open up the ability to perform divergence time inference on large phylogenetic studies.
Methodology for the inference of gene function from phenotype data.

Science.gov (United States)

Ascensao, Joao A; Dolan, Mary E; Hill, David P; Blake, Judith A

2014-12-12

Biomedical ontologies are increasingly instrumental in the advancement of biological research primarily through their use to efficiently consolidate large amounts of data into structured, accessible sets. However, ontology development and usage can be hampered by the segregation of knowledge by domain that occurs due to independent development and use of the ontologies. The ability to infer data associated with one ontology to data associated with another ontology would prove useful in expanding information content and scope. We here focus on relating two ontologies: the Gene Ontology (GO), which encodes canonical gene function, and the Mammalian Phenotype Ontology (MP), which describes non-canonical phenotypes, using statistical methods to suggest GO functional annotations from existing MP phenotype annotations. This work is in contrast to previous studies that have focused on inferring gene function from phenotype primarily through lexical or semantic similarity measures. We have designed and tested a set of algorithms that represents a novel methodology to define rules for predicting gene function by examining the emergent structure and relationships between the gene functions and phenotypes rather than inspecting the terms semantically. The algorithms inspect relationships among multiple phenotype terms to deduce if there are cases where they all arise from a single gene function. We apply this methodology to data about genes in the laboratory mouse that are formally represented in the Mouse Genome Informatics (MGI) resource. From the data, 7444 rule instances were generated from five generalized rules, resulting in 4818 unique GO functional predictions for 1796 genes. We show that our method is capable of inferring high-quality functional annotations from curated phenotype data. As well as creating inferred annotations, our method has the potential to allow for the elucidation of unforeseen, biologically significant associations between gene function and
Analytic continuation of quantum Monte Carlo data by stochastic analytical inference.

Science.gov (United States)

Fuchs, Sebastian; Pruschke, Thomas; Jarrell, Mark

2010-05-01

We present an algorithm for the analytic continuation of imaginary-time quantum Monte Carlo data which is strictly based on principles of Bayesian statistical inference. Within this framework we are able to obtain an explicit expression for the calculation of a weighted average over possible energy spectra, which can be evaluated by standard Monte Carlo simulations, yielding as by-product also the distribution function as function of the regularization parameter. Our algorithm thus avoids the usual ad hoc assumptions introduced in similar algorithms to fix the regularization parameter. We apply the algorithm to imaginary-time quantum Monte Carlo data and compare the resulting energy spectra with those from a standard maximum-entropy calculation.
Inferring Phylogenetic Networks from Gene Order Data

Directory of Open Access Journals (Sweden)

Alexey Anatolievich Morozov

2013-01-01

Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using gene order data as an input. Here we describe several methods to build split networks from gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from the genome structures under study and used as an input for network construction algorithms. Three intermediates are used: a set of jackknife trees, a distance matrix, and a binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and the distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana

KAUST Repository

Muraro, D.; Voss, U.; Wilson, M.; Bennett, M.; Byrne, H.; De Smet, I.; Hodgman, C.; King, J.

2013-01-01

thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based
Inference of gene-phenotype associations via protein-protein interaction and orthology.

Directory of Open Access Journals (Sweden)

Panwen Wang

One of the fundamental goals of genetics is to understand gene functions and their associated phenotypes. To achieve this goal, in this study we developed a computational algorithm that uses orthology and protein-protein interaction information to infer gene-phenotype associations for multiple species. Furthermore, we developed a web server that provides genome-wide phenotype inference for six species: fly, human, mouse, worm, yeast, and zebrafish. We evaluated our inference method by comparing the inferred results with known gene-phenotype associations. The high Area Under the Curve values indicate that our method performs well. By applying our method to two representative human diseases, Type 2 Diabetes and Breast Cancer, we demonstrated that our method is able to identify related Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways. The web server can be used to infer functions and putative phenotypes of a gene along with the candidate genes of a phenotype, and thus aids in disease candidate gene discovery. Our web server is available at http://jjwanglab.org/PhenoPPIOrth.

Spatial Inference Based on Geometric Proportional Analogies

OpenAIRE

Mullally, Emma-Claire; O'Donoghue, Diarmuid P.

2006-01-01

We describe an instance-based reasoning solution to a variety of spatial reasoning problems. The solution centers on identifying an isomorphic mapping between labelled graphs that represent some problem data and a known solution instance. We describe a number of spatial reasoning problems that are solved by generating non-deductive inferences, integrating topology with area (and other) features. We report the accuracy of our algorithm on different categories of spatial reasoning tasks from th...
Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

Science.gov (United States)

Patel, Nihir; Wang, Jason T L

2015-10-01

Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.
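The iterative step of extracting reliable negatives from unlabelled data can be sketched as below. This is only an illustration of the general loop: a nearest-centroid score stands in for the SVM and random forest classifiers used in the paper, and the stopping rule, function names, and parameters are assumptions rather than the authors' procedure.

```python
def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def select_reliable_negatives(positives, unlabelled, n_negatives, n_rounds=5):
    """Iteratively pick the unlabelled examples that look least positive."""
    pos_c = centroid(positives)
    # Start by treating every unlabelled example as a provisional negative.
    negatives = list(unlabelled)
    for _ in range(n_rounds):
        neg_c = centroid(negatives)
        # A larger score means farther from positives, closer to negatives.
        ranked = sorted(
            unlabelled,
            key=lambda x: sq_dist(x, pos_c) - sq_dist(x, neg_c),
            reverse=True,
        )
        chosen = ranked[:n_negatives]
        if chosen == negatives:  # the selection has stabilised
            break
        negatives = chosen
    return negatives


if __name__ == "__main__":
    known_pos = [[0.0, 0.0], [0.2, 0.1]]
    pool = [[0.1, 0.0], [5.0, 5.0], [5.2, 4.9], [0.0, 0.2]]
    print(select_reliable_negatives(known_pos, pool, n_negatives=2))
```

In practice the scoring model would be retrained on the known positives plus the current negatives in each round, and the resulting negatives would then feed a final supervised classifier.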
A Cautionary Analysis of STAPLE Using Direct Inference of Segmentation Truth

DEFF Research Database (Denmark)

Van Leemput, Koen; Sabuncu, Mert R.

2014-01-01

In this paper we analyze the properties of the well-known segmentation fusion algorithm STAPLE, using a novel inference technique that analytically marginalizes out all model parameters. We demonstrate both theoretically and empirically that when the number of raters is large, or when consensus r...
A Hybrid Approach Based on the Combination of Adaptive Neuro-Fuzzy Inference System and Imperialist Competitive Algorithm: Oil Flow Rate of the Wells Prediction Case Study

Directory of Open Access Journals (Sweden)

Shahram Mollaiy Berneti

2013-04-01

In this paper, a novel hybrid approach composed of an adaptive neuro-fuzzy inference system (ANFIS) and the imperialist competitive algorithm (ICA) is proposed. The ICA is used in this methodology to determine the most suitable initial membership functions of the ANFIS. The proposed model combines the global search ability of ICA with the local search ability of the gradient descent method. To illustrate the suitability and capability of the proposed model, it is applied to predict the oil flow rate of wells using a data set of 31 wells in one of the northern Persian Gulf oil fields of Iran. The data set was collected in a three-month period for each well from Dec. 2002 to Nov. 2010. For the sake of performance evaluation, the results of the proposed model are compared with the conventional ANFIS model. The results show that significant improvements are achievable using the proposed model in comparison with the results obtained by conventional ANFIS.
Inferring transcriptional compensation interactions in yeast via stepwise structure equation modeling

Directory of Open Access Journals (Sweden)

Wang Woei-Fuh

2008-03-01

Background: With the abundant information produced by microarray technology, various approaches have been proposed to infer transcriptional regulatory networks. However, few approaches have studied subtle and indirect interactions such as genetic compensation, the existence of which is widely recognized although its mechanism has yet to be clarified. Furthermore, when inferring gene networks most models include only observed variables, whereas latent factors, such as proteins and mRNA degradation that are not measured by microarrays, do participate in networks in reality. Results: Motivated by inferring transcriptional compensation (TC) interactions in yeast, a stepwise structural equation modeling algorithm (SSEM) is developed. In addition to observed variables, SSEM also incorporates hidden variables to capture interactions (or regulations) from latent factors. Simulated gene networks are used to determine with which of six possible model selection criteria (MSC) SSEM works best. SSEM with the Bayesian information criterion (BIC) results in the highest true positive rates, the largest percentage of correctly predicted interactions from all existing interactions, and the highest true negative (non-existing interactions) rates. Next, we apply SSEM using real microarray data to infer TC interactions among (1) small groups of genes that are synthetic sick or lethal (SSL) to SGS1, and (2) a group of SSL pairs of 51 yeast genes involved in DNA synthesis and repair that are of interest. For (1), SSEM with BIC is shown to outperform three Bayesian network algorithms and a multivariate autoregressive model, checked against the results of qRT-PCR experiments. The predictions for (2) are shown to coincide with several known pathways of Sgs1 and its partners that are involved in DNA replication, recombination and repair. In addition, experimentally testable interactions of Rad27 are predicted. Conclusion: SSEM is a useful tool for inferring genetic networks, and the
Data Provenance Inference in Logic Programming: Reducing Effort of Instance-driven Debugging

NARCIS (Netherlands)

Huq, M.R.; Mileo, Alessandra; Wombacher, Andreas

Data provenance allows scientists in different domains to validate their models and algorithms and to find anomalies and unexpected behaviors. In previous work, we described on-the-fly interpretation of (Python) scripts to build a workflow provenance graph automatically and then infer fine-grained
Hardware Acceleration of Adaptive Neural Algorithms.

Energy Technology Data Exchange (ETDEWEB)

James, Conrad D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-11-01

As traditional numerical computing has faced challenges, researchers have turned towards alternative computing approaches to reduce power-per-computation metrics and improve algorithm performance. Here, we describe an approach towards non-conventional computing that strengthens the connection between machine learning and neuroscience concepts. The Hardware Acceleration of Adaptive Neural Algorithms (HAANA) project has developed neural machine learning algorithms and hardware for applications in image processing and cybersecurity. While machine learning methods are effective at extracting relevant features from many types of data, the effectiveness of these algorithms degrades when subjected to real-world conditions. Our team has generated novel neural-inspired approaches to improve the resiliency and adaptability of machine learning algorithms. In addition, we have also designed and fabricated hardware architectures and microelectronic devices specifically tuned towards the training and inference operations of neural-inspired algorithms. Finally, our multi-scale simulation framework allows us to assess the impact of microelectronic device properties on algorithm performance.
An Energy-Efficient and Scalable Deep Learning/Inference Processor With Tetra-Parallel MIMD Architecture for Big Data Applications.

Science.gov (United States)

Park, Seong-Wook; Park, Junyoung; Bong, Kyeongryeol; Shin, Dongjoo; Lee, Jinmook; Choi, Sungpill; Yoo, Hoi-Jun

2015-12-01

Deep learning algorithms are widely used for various pattern recognition applications such as text recognition, object recognition and action recognition because of their best-in-class recognition accuracy compared to hand-crafted and shallow-learning-based algorithms. The long learning time caused by their complex structure, however, has so far limited their use to high-cost servers or many-core GPU platforms. On the other hand, the demand for customized pattern recognition within personal devices will grow gradually as more deep learning applications are developed. This paper presents an SoC implementation that enables deep learning applications to run on low-cost platforms such as mobile or portable devices. Unlike conventional works, which have adopted massively parallel architectures, this work adopts a task-flexible architecture and exploits multiple forms of parallelism to cover the complex functions of the convolutional deep belief network, one of the popular deep learning/inference algorithms. In this paper, we implement the most energy-efficient deep learning and inference processor for wearable systems. The implemented 2.5 mm × 4.0 mm deep learning/inference processor is fabricated using 65 nm 8-metal CMOS technology for a battery-powered platform with real-time deep inference and deep learning operation. It consumes 185 mW average power and 213.1 mW peak power at 200 MHz operating frequency and 1.2 V supply voltage. It achieves 411.3 GOPS peak performance and 1.93 TOPS/W energy efficiency, which is 2.07× higher than the state-of-the-art.
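The efficiency figure in the abstract above can be cross-checked from its own numbers. The following is a minimal sketch, assuming that energy efficiency means peak throughput divided by power; the abstract does not say which power figure was used, so both are computed, and the helper name `tops_per_watt` is hypothetical.

```python
# Figures quoted in the abstract above.
peak_gops = 411.3       # peak performance in GOPS
peak_power_mw = 213.1   # peak power in mW
avg_power_mw = 185.0    # average power in mW


def tops_per_watt(gops: float, power_mw: float) -> float:
    """Convert a throughput in GOPS and a power in mW into TOPS/W."""
    tops = gops / 1000.0
    watts = power_mw / 1000.0
    return tops / watts


# Assumption: efficiency = peak throughput / power draw.
eff_peak = tops_per_watt(peak_gops, peak_power_mw)
eff_avg = tops_per_watt(peak_gops, avg_power_mw)

print(f"{eff_peak:.2f} TOPS/W using peak power")
print(f"{eff_avg:.2f} TOPS/W using average power")
```

Under this assumption, dividing by the peak power reproduces the quoted 1.93 TOPS/W, which suggests the authors paired peak performance with peak power.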
Inference of neuronal network spike dynamics and topology from calcium imaging data

Directory of Open Access Journals (Sweden)

Henry Lütcke

2013-12-01

Full Text Available Two-photon calcium imaging enables functional analysis of neuronal circuits by inferring action potential (AP) occurrence ('spike trains') from cellular fluorescence signals. It remains unclear how experimental parameters such as signal-to-noise ratio (SNR) and acquisition rate affect spike inference and whether additional information about network structure can be extracted. Here we present a simulation framework for quantitatively assessing how well spike dynamics and network topology can be inferred from noisy calcium imaging data. For simulated AP-evoked calcium transients in neocortical pyramidal cells, we analyzed the quality of spike inference as a function of SNR and data acquisition rate using a recently introduced peeling algorithm. Given experimentally attainable values of SNR and acquisition rate, neural spike trains could be reconstructed accurately and with up to millisecond precision. We then applied statistical neuronal network models to explore how remaining uncertainties in spike inference affect estimates of network connectivity and topological features of network organization. We define the experimental conditions suitable for inferring whether the network has a scale-free structure and determine how well hub neurons can be identified. Our findings provide a benchmark for future calcium imaging studies that aim to reliably infer neuronal network properties.
Inference of gene regulatory networks from time series by Tsallis entropy

Directory of Open Access Journals (Sweden)

de Oliveira Evaldo A

2011-05-01

Full Text Available Abstract Background The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRN inference methods based on entropy (mutual information), a new criterion function is here proposed. Results In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5
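To make the criterion function in the record above concrete, here is a minimal sketch of the Tsallis entropy, S_q(p) = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. The conditional form used below (a frequency-weighted mean of per-state entropies) is only one of several definitions in the literature and is an assumption, as are the function names, the toy data and the value q = 2.0; none of this is taken from the paper.

```python
import math
from collections import Counter


def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution; Shannon (nats) when q == 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def conditional_tsallis(target, predictor_states, q):
    """Assumed definition: mean entropy of the target within each predictor
    state, weighted by how often that state occurs."""
    groups = {}
    for y, x in zip(target, predictor_states):
        groups.setdefault(x, []).append(y)
    n = len(target)
    total = 0.0
    for ys in groups.values():
        probs = [c / len(ys) for c in Counter(ys).values()]
        total += (len(ys) / n) * tsallis_entropy(probs, q)
    return total


# Hypothetical binarized expression of a target gene at t+1 and of two
# candidate regulators at t; lower conditional entropy = better predictor.
target = [0, 1, 1, 0, 1, 0]
regulator_a = [0, 1, 1, 0, 1, 0]   # determines the target exactly
regulator_b = [0, 0, 1, 1, 0, 1]   # only weakly informative
q = 2.0                            # arbitrary illustrative value
score_a = conditional_tsallis(target, regulator_a, q)
score_b = conditional_tsallis(target, regulator_b, q)
print(score_a, score_b)
```

In a feature selection setting, the candidate regulator set with the lowest criterion value would be kept for each target gene.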
Fetal ECG extraction via Type-2 adaptive neuro-fuzzy inference systems.

Science.gov (United States)

Ahmadieh, Hajar; Asl, Babak Mohammadzadeh

2017-04-01

We proposed a noninvasive method for separating the fetal ECG (FECG) from maternal ECG (MECG) by using Type-2 adaptive neuro-fuzzy inference systems. The method can extract FECG components from abdominal signal by using one abdominal channel, including maternal and fetal cardiac signals and other environmental noise signals, and one chest channel. The proposed algorithm detects the nonlinear dynamics of the mother's body. So, the components of the MECG are estimated from the abdominal signal. By subtracting estimated mother cardiac signal from abdominal signal, fetal cardiac signal can be extracted. This algorithm was applied on synthetic ECG signals generated based on the models developed by McSharry et al. and Behar et al. and also on DaISy real database. In environments with high uncertainty, our method performs better than the Type-1 fuzzy method. Specifically, in evaluation of the algorithm with the synthetic data based on McSharry model, for input signals with SNR of -5dB, the SNR of the extracted FECG was improved by 38.38% in comparison with the Type-1 fuzzy method. Also, the results show that increasing the uncertainty or decreasing the input SNR leads to increasing the percentage of the improvement in SNR of the extracted FECG. For instance, when the SNR of the input signal decreases to -30dB, our proposed algorithm improves the SNR of the extracted FECG by 71.06% with respect to the Type-1 fuzzy method. The same results were obtained on synthetic data based on Behar model. Our results on real database reflect the success of the proposed method to separate the maternal and fetal heart signals even if their waves overlap in time. Moreover, the proposed algorithm was applied to the simulated fetal ECG with ectopic beats and achieved good results in separating FECG from MECG. The results show the superiority of the proposed Type-2 neuro-fuzzy inference method over the Type-1 neuro-fuzzy inference and the polynomial networks methods, which is due to its
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling

International Nuclear Information System (INIS)

Li Yupeng; Deutsch, Clayton V.

2012-01-01

In geostatistics, most stochastic algorithms for simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations including the unsampled location permits calculation of the conditional probability directly based on its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification to an initial estimated multivariate probability using lower order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. This algorithm can be extended to higher order marginal probability constraints as used in multiple point statistics. The theoretical framework is developed and illustrated with estimation and simulation examples.
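The core IPF update can be sketched in a few lines. This is a deliberately reduced illustration, assuming a two-way table constrained by one-dimensional row and column marginals, whereas the article constrains a higher-dimensional distribution with bivariate marginals and uses a sparse matrix; the function name and the numbers are hypothetical.

```python
def ipf(initial, row_targets, col_targets, iterations=100, tol=1e-10):
    """Rescale a 2-D table alternately by rows and columns until its
    marginals match the targets (iterative proportional fitting)."""
    table = [row[:] for row in initial]  # work on a copy
    for _ in range(iterations):
        # Row step: scale each row to its target sum.
        for i, row in enumerate(table):
            s = sum(row)
            if s > 0:
                table[i] = [v * row_targets[i] / s for v in row]
        # Column step: scale each column to its target sum.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns are now exact; stop once rows also match.
        err = max(abs(sum(row) - row_targets[i]) for i, row in enumerate(table))
        if err < tol:
            break
    return table


# Hypothetical initial joint probability of two facies at two locations,
# adjusted to assumed marginal proportions.
initial = [[0.4, 0.1],
           [0.1, 0.4]]
fitted = ipf(initial, row_targets=[0.7, 0.3], col_targets=[0.6, 0.4])
for row in fitted:
    print([round(v, 4) for v in row])
```

The fitted table keeps the interaction structure of the initial estimate (its odds ratio) while honoring the imposed marginals, which is the property that makes IPF attractive for this kind of constraint problem.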
Assessment of algorithms for inferring positional weight matrix motifs of transcription factor binding sites using protein binding microarray data.

Directory of Open Access Journals (Sweden)

Yaron Orenstein

Full Text Available The new technology of protein binding microarrays (PBMs) allows simultaneous measurement of the binding intensities of a transcription factor to tens of thousands of synthetic double-stranded DNA probes, covering all possible 10-mers. A key computational challenge is inferring the binding motif from these data. We present a systematic comparison of four methods developed specifically for reconstructing a binding site motif represented as a positional weight matrix from PBM data. The reconstructed motifs were evaluated in terms of three criteria: concordance with reference motifs from the literature, and the ability to predict in vivo and in vitro binding. The evaluation encompassed over 200 transcription factors and some 300 assays. The results show a tradeoff between how the methods perform according to the different criteria, and a dichotomy of method types. Algorithms that construct motifs with low information content predict PBM probe ranking more faithfully, while methods that produce highly informative motifs match reference motifs better. Interestingly, in predicting high-affinity binding, all methods give far poorer results for in vivo assays compared to in vitro assays.
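For readers unfamiliar with the terms in the record above, the sketch below shows what a positional weight matrix and its information content are. It is a generic textbook construction from aligned sites, not one of the four PBM-specific methods the paper compares; the function names, the pseudocount, the uniform background and the example sites are all assumptions.

```python
import math

BASES = "ACGT"


def build_pfm(sites, pseudocount=1.0):
    """Per-position base frequencies from equal-length aligned sites."""
    length = len(sites[0])
    pfm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pfm.append({b: counts[b] / total for b in BASES})
    return pfm


def information_content(pfm):
    """Total information content in bits against a uniform background
    (at most 2 bits per position)."""
    return sum(2.0 + sum(p * math.log2(p) for p in col.values() if p > 0)
               for col in pfm)


def log_odds_score(pfm, seq, background=0.25):
    """Sum of per-position log2 odds of the motif versus the background."""
    return sum(math.log2(pfm[i][b] / background) for i, b in enumerate(seq))


# Hypothetical aligned binding sites.
sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTAA"]
pfm = build_pfm(sites)
ic = information_content(pfm)
match = log_odds_score(pfm, "TGACTCA")
mismatch = log_odds_score(pfm, "ACGTACG")
print(round(ic, 2), round(match, 2), round(mismatch, 2))
```

A larger pseudocount flattens the matrix and lowers its information content, which is one simple way to see the tradeoff the abstract describes between low-information and highly informative motifs.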
Inferring time derivatives including cell growth rates using Gaussian processes

Science.gov (United States)

Swain, Peter S.; Stevenson, Keiran; Leary, Allen; Montano-Gutierrez, Luis F.; Clark, Ivan B. N.; Vogel, Jackie; Pilizota, Teuta

2016-12-01

Often the time derivative of a measured variable is of as much interest as the variable itself. For a growing population of biological cells, for example, the population's growth rate is typically more important than its size. Here we introduce a non-parametric method to infer first and second time derivatives as a function of time from time-series data. Our approach is based on Gaussian processes and applies to a wide range of data. In tests, the method is at least as accurate as others, but has several advantages: it estimates errors both in the inference and in any summary statistics, such as lag times, and allows interpolation with the corresponding error estimation. As illustrations, we infer growth rates of microbial cells, the rate of assembly of an amyloid fibril and both the speed and acceleration of two separating spindle pole bodies. Our algorithm should thus be broadly applicable.
Belief propagation and replicas for inference and learning in a kinetic Ising model with hidden spins

International Nuclear Information System (INIS)

Battistin, C; Roudi, Y; Hertz, J; Tyrcha, J

2015-01-01

We propose a new algorithm for inferring the state of hidden spins and reconstructing the connections in a synchronous kinetic Ising model, given the observed history. Focusing on the case in which the hidden spins are conditionally independent of each other given the state of observable spins, we show that calculating the likelihood of the data can be simplified by introducing a set of replicated auxiliary spins. Belief propagation (BP) and susceptibility propagation (SusP) can then be used to infer the states of hidden variables and to learn the couplings. We study the convergence and performance of this algorithm for networks with both Gaussian-distributed and binary bonds. We also study how the algorithm behaves as the fraction of hidden nodes and the amount of data are changed, showing that it outperforms the Thouless–Anderson–Palmer (TAP) equations for reconstructing the connections. (paper)
STATISTICAL RELATIONAL LEARNING AND SCRIPT INDUCTION FOR TEXTUAL INFERENCE

Science.gov (United States)

2017-12-01

compensate for parser errors. We replace deterministic conjunction by an average combiner, which encodes causal independence. Our framework was the...sentence similarity (STS) and sentence paraphrasing, but not Textual Entailment, where deeper inferences are required. As the formula for conjunction ...When combined, our algorithm learns to rely on systems that not just agree on an output but also the provenance of this output in conjunction with the
A Demosaicking Algorithm with Adaptive Inter-Channel Correlation

Directory of Open Access Journals (Sweden)

Joan Duran

2015-12-01

Full Text Available Most common cameras use a CCD sensor device measuring a single color per pixel. Demosaicking is the interpolation process by which one can infer a full color image from such a matrix of values, thus interpolating the two missing components per pixel. Most demosaicking methods take advantage of inter-channel correlation locally selecting the best interpolation direction. The obtained results look convincing except when local geometry cannot be inferred from neighboring pixels or channel correlation is low. In these cases, these algorithms create interpolation artifacts such as zipper effect or color aliasing. This paper discusses the implementation details of the algorithm proposed in [J. Duran, A. Buades, "Self-Similarity and Spectral Correlation Adaptive Algorithm for Color Demosaicking", IEEE Transactions on Image Processing, 23(9), pp. 4031-4040, 2014]. The proposed method involves nonlocal image self-similarity in order to reduce interpolation artifacts when local geometry is ambiguous. It further introduces a clear and intuitive manner of balancing how much channel-correlation must be taken advantage of.
The Forward-Reverse Algorithm for Stochastic Reaction Networks

KAUST Repository

Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro

2015-01-01

In this work, we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem
A neuro-fuzzy inference system for sensor failure detection using wavelet denoising, PCA and SPRT

International Nuclear Information System (INIS)

Na, Man Gyun

2001-01-01

In this work, a neuro-fuzzy inference system combined with the wavelet denoising, PCA (principal component analysis) and SPRT (sequential probability ratio test) methods is developed to detect the relevant sensor failure using other sensor signals. The wavelet denoising technique is applied to remove noise components in input signals into the neuro-fuzzy system. The PCA is used to reduce the dimension of an input space without losing a significant amount of information. The PCA eases the selection of the input signals into the neuro-fuzzy system. Also, a lower dimensional input space usually reduces the time necessary to train a neuro-fuzzy system. The parameters of the neuro-fuzzy inference system which estimates the relevant sensor signal are optimized by a genetic algorithm and a least-squares algorithm. The residuals between the estimated signals and the measured signals are used to detect whether the sensors have failed or not. The SPRT is used in this failure detection algorithm. The proposed sensor-monitoring algorithm was verified through applications to the pressurizer water level and the hot-leg flowrate sensors in pressurized water reactors.
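The SPRT step on residuals can be illustrated with Wald's classical test for a mean shift in Gaussian noise. This is a minimal sketch under stated assumptions, not the paper's implementation: the residuals are taken to be Gaussian with known standard deviation `sigma`, the failure hypothesis is a mean shift of size `m1`, and the test restarts whenever the sensor is judged normal. All names and the data are hypothetical.

```python
import math


def sprt_mean_shift(residuals, m1, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT between H0 (residual mean 0) and H1 (residual mean m1).

    alpha is the false-alarm probability, beta the missed-alarm probability.
    Returns ("failed", index) at the first sample where H1 is accepted,
    otherwise ("no failure detected", None).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    llr = 0.0
    for k, r in enumerate(residuals):
        # Log-likelihood-ratio increment for a Gaussian mean shift.
        llr += (m1 / sigma ** 2) * (r - m1 / 2.0)
        if llr >= upper:
            return "failed", k
        if llr <= lower:
            llr = 0.0  # sensor judged normal; restart the test
    return "no failure detected", None


# Hypothetical residuals (measured minus estimated signal): zero while the
# sensor is healthy, then a constant drift of 1.0 from sample 20 onward.
healthy = [0.0] * 50
drifting = [0.0] * 20 + [1.0] * 30
print(sprt_mean_shift(healthy, m1=1.0, sigma=1.0))
print(sprt_mean_shift(drifting, m1=1.0, sigma=1.0))
```

Smaller `alpha` and `beta` widen the decision thresholds, so the test takes more samples to decide but raises fewer false or missed alarms.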
Expectation propagation for large scale Bayesian inference of non-linear molecular networks from perturbation data.

Science.gov (United States)

Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger

2017-01-01

Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.

Protein-DNA binding dynamics predict transcriptional response to nutrients in archaea.

Science.gov (United States)

Todor, Horia; Sharma, Kriti; Pittman, Adrianne M C; Schmid, Amy K

2013-10-01

Organisms across all three domains of life use gene regulatory networks (GRNs) to integrate varied stimuli into coherent transcriptional responses to environmental pressures. However, inferring GRN topology and regulatory causality remains a central challenge in systems biology. Previous work characterized TrmB as a global metabolic transcription factor in archaeal extremophiles. However, it remains unclear how TrmB dynamically regulates its ∼100 metabolic enzyme-coding gene targets. Using a dynamic perturbation approach, we elucidate the topology of the TrmB metabolic GRN in the model archaeon Halobacterium salinarum. Clustering of dynamic gene expression patterns reveals that TrmB functions alone to regulate central metabolic enzyme-coding genes but cooperates with various regulators to control peripheral metabolic pathways. Using a dynamical model, we predict gene expression patterns for some TrmB-dependent promoters and infer secondary regulators for others. Our data suggest feed-forward gene regulatory topology for cobalamin biosynthesis. In contrast, purine biosynthesis appears to require TrmB-independent regulators. We conclude that TrmB is an important component for mediating metabolic modularity, integrating nutrient status and regulating gene expression dynamics alone and in concert with secondary regulators.
Fast Bayesian Inference in Dirichlet Process Mixture Models.

Science.gov (United States)

Wang, Lianming; Dunson, David B

2011-01-01

There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
The Forward-Reverse Algorithm for Stochastic Reaction Networks

KAUST Repository

Bayer, Christian

2015-01-07

In this work, we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem of approximating the reaction coefficients based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which we solve a set of deterministic optimization problems where the SRNs are replaced by the classical ODE rates; then, during the second phase, the Monte Carlo version of the EM algorithm is applied starting from the output of the previous phase. Starting from a set of over-dispersed seeds, the output of our two-phase method is a cluster of maximum likelihood estimates obtained by using convergence assessment techniques from the theory of Markov chain Monte Carlo.
Influence of the experimental design of gene expression studies on the inference of gene regulatory networks: environmental factors

Directory of Open Access Journals (Sweden)

Frank Emmert-Streib

2013-02-01

Full Text Available The inference of gene regulatory networks has gained considerable interest in the biology and biomedical community in recent years. The purpose of this paper is to investigate the influence that environmental conditions can have on the performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I) observational gene expression data: normal environmental condition, (II) interventional gene expression data: growth in rich media, (III) interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.
Applying a multiobjective metaheuristic inspired by honey bees to phylogenetic inference.

Science.gov (United States)

Santander-Jiménez, Sergio; Vega-Rodríguez, Miguel A

2013-10-01

The development of increasingly popular multiobjective metaheuristics has allowed bioinformaticians to deal with optimization problems in computational biology where multiple objective functions must be taken into account. One of the most relevant research topics that can benefit from these techniques is phylogenetic inference. Throughout the years, different researchers have proposed their own view about the reconstruction of ancestral evolutionary relationships among species. As a result, biologists often report different phylogenetic trees from the same dataset when considering distinct optimality principles. In this work, we detail a multiobjective swarm intelligence approach based on the novel Artificial Bee Colony algorithm for inferring phylogenies. The aim of this paper is to propose a complementary view of phylogenetics according to the maximum parsimony and maximum likelihood criteria, in order to generate a set of phylogenetic trees that represent a compromise between these principles. Experimental results on a variety of nucleotide data sets and statistical studies highlight the relevance of the proposal with regard to other multiobjective algorithms and state-of-the-art biological methods.
Maximum Likelihood Method for Predicting Environmental Conditions from Assemblage Composition: The R Package bio.infer

Directory of Open Access Journals (Sweden)

Lester L. Yuan

2007-06-01

Full Text Available This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML) methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.
Inferring Trial-to-Trial Excitatory and Inhibitory Synaptic Inputs from Membrane Potential using Gaussian Mixture Kalman Filtering

Directory of Open Access Journals (Sweden)

Milad Lankarany

2013-09-01

Full Text Available Time-varying excitatory and inhibitory synaptic inputs govern activity of neurons and process information in the brain. The importance of trial-to-trial fluctuations of synaptic inputs has recently been investigated in neuroscience. Such fluctuations are ignored in the most conventional techniques because they are removed when trials are averaged during linear regression techniques. Here, we propose a novel recursive algorithm based on Gaussian mixture Kalman filtering for estimating time-varying excitatory and inhibitory synaptic inputs from single trials of noisy membrane potential in current clamp recordings. The Kalman filtering is followed by an expectation maximization algorithm to infer the statistical parameters (time-varying mean and variance) of the synaptic inputs in a non-parametric manner. As our proposed algorithm is repeated recursively, the inferred parameters of the mixtures are used to initiate the next iteration. Unlike other recent algorithms, our algorithm does not assume an a priori distribution from which the synaptic inputs are generated. Instead, the algorithm recursively estimates such a distribution by fitting a Gaussian mixture model. The performance of the proposed algorithms is compared to a previously proposed PF-based algorithm (Paninski et al., 2012) with several illustrative examples, assuming that the distribution of synaptic input is unknown. If noise is small, the performance of our algorithms is similar to that of the previous one. However, if noise is large, they can significantly outperform the previous proposal. These promising results suggest that our algorithm is a robust and efficient technique for estimating time varying excitatory and inhibitory synaptic conductances from single trials of membrane potential recordings.
Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks.

Science.gov (United States)

Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J

2013-01-01

We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.
Detection of algorithmic trading

Science.gov (United States)

Bogoev, Dimitar; Karam, Arzé

2017-10-01

We develop a new approach to reflect the behavior of algorithmic traders. Specifically, we provide an analytical and tractable way to infer patterns of quote volatility and price momentum consistent with different types of strategies employed by algorithmic traders, and we propose two ratios to quantify these patterns. Quote volatility ratio is based on the rate of oscillation of the best ask and best bid quotes over an extremely short period of time; whereas price momentum ratio is based on identifying patterns of rapid upward or downward movement in prices. The two ratios are evaluated across several asset classes. We further run a two-stage Artificial Neural Network experiment on the quote volatility ratio; the first stage is used to detect the quote volatility patterns resulting from algorithmic activity, while the second is used to validate the quality of signal detection provided by our measure.
Flood susceptibility mapping using novel ensembles of adaptive neuro fuzzy inference system and metaheuristic algorithms.

Science.gov (United States)

Razavi Termeh, Seyed Vahid; Kornejady, Aiding; Pourghasemi, Hamid Reza; Keesstra, Saskia

2018-02-15

Flooding is one of the most destructive natural disasters, causing great financial losses and loss of life every year. Therefore, producing susceptibility maps for flood management is necessary in order to reduce its harmful effects. The aim of the present study is to map flood hazard over the Jahrom Township in Fars Province using a combination of adaptive neuro-fuzzy inference systems (ANFIS) with different metaheuristic algorithms such as ant colony optimization (ACO), genetic algorithm (GA), and particle swarm optimization (PSO), and to compare their accuracy. A total number of 53 flood locations areas were identified, 35 locations of which were randomly selected in order to model flood susceptibility and the remaining 16 locations were used to validate the models. Learning vector quantization (LVQ), as one of the supervised neural network methods, was employed in order to estimate factors' importance. Nine flood conditioning factors, namely slope degree, plan curvature, altitude, topographic wetness index (TWI), stream power index (SPI), distance from river, land use/land cover, rainfall, and lithology, were selected and the corresponding maps were prepared in ArcGIS. The frequency ratio (FR) model was used to assign weights to each class within a particular controlling factor, then the weights were transferred into MATLAB software for further analyses and to combine with metaheuristic models. The ANFIS-PSO was found to be the most practical model in terms of producing the highly focused flood susceptibility map with lesser spatial distribution related to highly susceptible classes. The chi-square result attests the same, where the ANFIS-PSO had the highest spatial differentiation within flood susceptibility classes over the study area. The area under the curve (AUC) obtained from the ROC curve indicated accuracies of 91.4%, 91.8%, 92.6% and 94.5% for the respective models of FR, ANFIS-ACO, ANFIS-GA, and ANFIS-PSO ensembles. So, the ensemble of ANFIS-PSO was introduced as the
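The frequency ratio weighting mentioned in the record above can be sketched as follows, assuming the commonly used definition: the share of flood locations falling in a class divided by the share of the study area that the class occupies. The class labels, counts and areas below are invented for illustration and are not taken from the study.

```python
def frequency_ratio(flood_counts, class_areas):
    """Frequency ratio per class: (share of floods) / (share of area).

    Values above 1 indicate classes where floods are over-represented.
    """
    total_floods = sum(flood_counts.values())
    total_area = sum(class_areas.values())
    ratios = {}
    for cls, area in class_areas.items():
        flood_share = flood_counts.get(cls, 0) / total_floods
        area_share = area / total_area
        ratios[cls] = flood_share / area_share
    return ratios


# Hypothetical slope classes (degrees), training flood counts and areas (km^2).
flood_counts = {"0-5": 20, "5-15": 10, ">15": 5}
class_areas = {"0-5": 100.0, "5-15": 200.0, ">15": 400.0}
weights = frequency_ratio(flood_counts, class_areas)
for cls, fr in weights.items():
    print(cls, round(fr, 2))
```

In a workflow like the one described, such per-class weights would be computed for every conditioning factor and then passed on as inputs to the ANFIS models.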
A Neuro-Fuzzy Inference System Combining Wavelet Denoising, Principal Component Analysis, and Sequential Probability Ratio Test for Sensor Monitoring

International Nuclear Information System (INIS)

Na, Man Gyun; Oh, Seungrohk

2002-01-01

A neuro-fuzzy inference system combined with the wavelet denoising, principal component analysis (PCA), and sequential probability ratio test (SPRT) methods has been developed to monitor the relevant sensor using the information of other sensors. The parameters of the neuro-fuzzy inference system that estimates the relevant sensor signal are optimized by a genetic algorithm and a least-squares algorithm. The wavelet denoising technique was applied to remove noise components in the input signals to the neuro-fuzzy system. By reducing the dimension of the input space of the neuro-fuzzy system without losing a significant amount of information, the PCA was used to reduce the time necessary to train the neuro-fuzzy system, to simplify the structure of the neuro-fuzzy inference system, and to ease the selection of the input signals to the neuro-fuzzy system. Using the residual signals between the estimated signals and the measured signals, the SPRT is applied to detect whether the sensors are degraded or not. The proposed sensor-monitoring algorithm was verified through applications to the pressurizer water level, the pressurizer pressure, and the hot-leg temperature sensors in pressurized water reactors.
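The residual-based SPRT step described above can be illustrated with a minimal sketch. It assumes Gaussian residuals with known standard deviation `sigma`, a healthy-sensor mean of zero, and a hypothetical degraded-sensor offset `m1`; the function name `sprt_detect` and the error rates `alpha` and `beta` are illustrative choices, not values from the record.

```python
import math

def sprt_detect(residuals, m1, sigma, alpha=0.01, beta=0.01):
    """Wald's SPRT for H0: mean 0 versus H1: mean m1 (Gaussian, known sigma).

    Returns (decision, index), where decision is 'degraded', 'normal',
    or 'undecided' and index is the 0-based sample at which the test stopped.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this value
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this value
    llr = 0.0
    for k, r in enumerate(residuals):
        # Log-likelihood ratio increment for a Gaussian mean shift of m1.
        llr += (m1 / sigma ** 2) * (r - m1 / 2.0)
        if llr >= upper:
            return "degraded", k
        if llr <= lower:
            return "normal", k
    return "undecided", len(residuals) - 1

# Residual = measured signal minus estimated signal (assumed inputs).
print(sprt_detect([1.0] * 20, m1=1.0, sigma=1.0))
print(sprt_detect([0.0] * 20, m1=1.0, sigma=1.0))
```

The two thresholds are Wald's standard approximations for the chosen false-alarm and missed-alarm probabilities; in practice the test would be restarted after each decision.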
Inference on inspiral signals using LISA MLDC data

International Nuclear Information System (INIS)

Roever, Christian; Stroeer, Alexander; Bloomer, Ed; Christensen, Nelson; Clark, James; Hendry, Martin; Messenger, Chris; Meyer, Renate; Pitkin, Matt; Toher, Jennifer; Umstaetter, Richard; Vecchio, Alberto; Veitch, John; Woan, Graham

2007-01-01

In this paper, we describe a Bayesian inference framework for the analysis of data obtained by LISA. We set up a model for binary inspiral signals as defined for the Mock LISA Data Challenge 1.2 (MLDC), and implemented a Markov chain Monte Carlo (MCMC) algorithm to facilitate exploration and integration of the posterior distribution over the nine-dimensional parameter space. Here, we present intermediate results showing how, using this method, information about the nine parameters can be extracted from the data.
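As a toy illustration of MCMC exploration of a posterior, the following is a minimal random-walk Metropolis sketch. It is not the nine-parameter sampler of the record: the one-dimensional target `log_post` (a unit-variance Gaussian centred at 3), the step size, and the function name `metropolis` are all assumptions made for illustration.

```python
import math
import random

def metropolis(log_post, x0, n_steps, step, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)          # symmetric Gaussian proposal
        lp_prop = log_post(prop)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
        samples.append(x)
    return samples

def log_post(theta):
    # Hypothetical unnormalised log-posterior: Gaussian with mean 3, variance 1.
    return -0.5 * (theta - 3.0) ** 2

samples = metropolis(log_post, x0=0.0, n_steps=20000, step=1.0)
kept = samples[2000:]                            # discard burn-in
post_mean = sum(kept) / len(kept)
print(round(post_mean, 2))
```

Posterior summaries such as means or credible intervals are then computed from the retained samples; a realistic sampler for a multi-dimensional problem would need tuned, possibly correlated, proposals.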
No interpretation without representation: the role of domain-specific representations and inferences in the Wason selection task.

Science.gov (United States)

Fiddick, L; Cosmides, L; Tooby, J

2000-10-16

The Wason selection task is a tool used to study reasoning about conditional rules. Performance on this task changes systematically when one varies its content, and these content effects have been used to argue that the human cognitive architecture contains a number of domain-specific representation and inference systems, such as social contract algorithms and hazard management systems. Recently, however, Sperber, Cara & Girotto (Sperber, D., Cara, F., & Girotto, V. (1995). Relevance theory explains the selection task. Cognition, 57, 31-95) have proposed that relevance theory can explain performance on the selection task - including all content effects - without invoking inference systems that are content-specialized. Herein, we show that relevance theory alone cannot explain a variety of content effects - effects that were predicted in advance and are parsimoniously explained by theories that invoke domain-specific algorithms for representing and making inferences about (i) social contracts and (ii) reducing risk in hazardous situations. Moreover, although Sperber et al. (1995) were able to use relevance theory to produce some new content effects in other domains, they conducted no experiments involving social exchanges or precautions, and so were unable to determine which - content-specialized algorithms or relevance effects - dominate reasoning when the two conflict. When experiments, reported herein, are constructed so that the different theories predict divergent outcomes, the results support the predictions of social contract theory and hazard management theory, indicating that these inference systems override content-general relevance factors. The fact that social contract and hazard management algorithms provide better explanations for performance in their respective domains does not mean that the content-general logical procedures posited by relevance theory do not exist, or that relevance effects never occur. It does mean, however, that one needs a
Bayesian inference for Markov jump processes with informative observations.

Science.gov (United States)

Golightly, Andrew; Wilkinson, Darren J

2015-04-01

In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.
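The forward simulator that the abstract treats as straightforward can be sketched with Gillespie's direct method for a Lotka-Volterra jump process. This is a minimal sketch of unconditioned forward simulation only, not of the conditioned-hazard constructs proposed in the record; the rate constants and initial state used below are illustrative assumptions.

```python
import random

def gillespie_lv(x0, rates, t_end, seed=0):
    """Simulate a Lotka-Volterra Markov jump process with the direct method.

    Reactions: prey birth (c1), predation (c2), predator death (c3).
    Returns a list of (time, prey, predators) tuples.
    """
    rng = random.Random(seed)
    c1, c2, c3 = rates
    prey, pred = x0
    t = 0.0
    path = [(t, prey, pred)]
    while True:
        hazards = (c1 * prey, c2 * prey * pred, c3 * pred)
        total = sum(hazards)
        if total == 0.0:
            break                        # absorbing state: nothing can fire
        t += rng.expovariate(total)      # exponential waiting time
        if t > t_end:
            break
        u = rng.random() * total         # pick a reaction proportional to hazard
        if u < hazards[0]:
            prey += 1
        elif u < hazards[0] + hazards[1]:
            prey -= 1
            pred += 1
        else:
            pred -= 1
        path.append((t, prey, pred))
    return path

path = gillespie_lv((100, 100), (0.5, 0.0025, 0.3), t_end=5.0, seed=1)
print(len(path), path[-1])
```

When observations are highly informative, few such unconditioned trajectories land near the next observation, which is the inefficiency the conditioned constructs are meant to address.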
Sign Inference for Dynamic Signed Networks via Dictionary Learning

Directory of Open Access Journals (Sweden)

Yi Cen

2013-01-01

Mobile online social networks (mOSNs) are a burgeoning research area. However, most existing work on mOSNs deals with static network structures and simply encodes whether relationships among entities exist or not. In contrast, relationships in signed mOSNs can be positive or negative and may change with time and location. Applying certain global characteristics of social balance, in this paper we aim to infer the unknown relationships in dynamic signed mOSNs and formulate this sign inference problem as a low-rank matrix estimation problem. Specifically, motivated by the Singular Value Thresholding (SVT) algorithm, a compact dictionary is selected from the observed dataset. Based on this compact dictionary, the relationships in the dynamic signed mOSNs are estimated by solving the formulated problem. Furthermore, the estimation accuracy is improved by employing a dictionary self-updating mechanism.
Algorithmic detectability threshold of the stochastic block model

Science.gov (United States)

Kawamoto, Tatsuro

2018-03-01

The assumption that the values of model parameters are known or correctly learned, i.e., the Nishimori condition, is one of the requirements for the detectability analysis of the stochastic block model in statistical inference. In practice, however, there is no example demonstrating that we can know the model parameters beforehand, and there is no guarantee that the model parameters can be learned accurately. In this study, we consider the expectation-maximization (EM) algorithm with belief propagation (BP) and derive its algorithmic detectability threshold. Our analysis is not restricted to the community structure but includes general modular structures. Because the algorithm cannot always learn the planted model parameters correctly, the algorithmic detectability threshold is qualitatively different from the one with the Nishimori condition.
Simulation and Statistical Inference of Stochastic Reaction Networks with Applications to Epidemic Models

KAUST Repository

Moraes, Alvaro

2015-01-01

Epidemics have shaped, sometimes more than wars and natural disasters, demographic aspects of human populations around the world, their health habits and their economies. Ebola and the Middle East Respiratory Syndrome (MERS) are clear and current examples of potential hazards at planetary scale. During the spread of an epidemic disease, there are phenomena, like the sudden extinction of the epidemic, that cannot be captured by deterministic models. As a consequence, stochastic models have been proposed during the last decades. A typical forward problem in the stochastic setting could be the approximation of the expected number of infected individuals one month from now. On the other hand, a typical inverse problem could be, given a discretely observed set of epidemiological data, to infer the transmission rate of the epidemic or its basic reproduction number. Markovian epidemic models are stochastic models belonging to a wide class of pure jump processes known as Stochastic Reaction Networks (SRNs), which are intended to describe the time evolution of interacting particle systems where one particle interacts with the others through a finite set of reaction channels. SRNs have been mainly developed to model biochemical reactions but they also have applications in neural networks, virus kinetics, and dynamics of social networks, among others. This PhD thesis is focused on novel fast simulation algorithms and statistical inference methods for SRNs. Our novel Multi-level Monte Carlo (MLMC) hybrid simulation algorithms provide accurate estimates of expected values of a given observable of SRNs at a prescribed final time. They are designed to control the global approximation error up to a user-selected accuracy and up to a certain confidence level, and with near optimal computational work. We also present novel dual-weighted residual expansions for fast estimation of weak and strong errors arising from the MLMC methodology. Regarding the statistical inference
Convergent cross-mapping and pairwise asymmetric inference.

Science.gov (United States)

McCracken, James M; Weigel, Robert S

2014-12-01

Convergent cross-mapping (CCM) is a technique for computing specific kinds of correlations between sets of time series. It was introduced by Sugihara et al. [Science 338, 496 (2012)] and is reported to be "a necessary condition for causation" capable of distinguishing causality from standard correlation. We show that the relationships between CCM correlations proposed by Sugihara et al. do not, in general, agree with intuitive concepts of "driving" and as such should not be considered indicative of causality. It is shown that whether the CCM algorithm implies causality is a function of system parameters for simple linear and nonlinear systems. For example, in a circuit containing a single resistor and inductor, both voltage and current can be identified as the driver depending on the frequency of the source voltage. It is shown that the CCM algorithm, however, can be modified to identify relationships between pairs of time series that are consistent with intuition for the considered example systems for which CCM causality analysis provided nonintuitive driver identifications. This modification of the CCM algorithm is introduced as "pairwise asymmetric inference" (PAI) and examples of its use are presented.
Inference and interrogation of a coregulatory network in the context of lipid accumulation in Yarrowia lipolytica.

Science.gov (United States)

Trébulle, Pauline; Nicaud, Jean-Marc; Leplat, Christophe; Elati, Mohamed

2017-01-01

Complex phenotypes, such as lipid accumulation, result from cooperativity between regulators and the integration of multiscale information. However, the elucidation of such regulatory programs by experimental approaches may be challenging, particularly in context-specific conditions. In particular, we know very little about the regulators of lipid accumulation in the oleaginous yeast of industrial interest Yarrowia lipolytica. This lack of knowledge limits the development of this yeast as an industrial platform, due to the time-consuming and costly laboratory efforts required to design strains with the desired phenotypes. In this study, we aimed to identify context-specific regulators and mechanisms, to guide explorations of the regulation of lipid accumulation in Y. lipolytica. Using gene regulatory network inference, and considering the expression of 6539 genes over 26 time points from GSE35447 for biolipid production and a list of 151 transcription factors, we reconstructed a gene regulatory network comprising 111 transcription factors, 4451 target genes and 17048 regulatory interactions (YL-GRN-1) supported by evidence of protein-protein interactions. This study, based on network interrogation and wet laboratory validation, (a) highlights the relevance of our proposed measure, the transcription factors influence, for identifying phases corresponding to changes in physiological state without prior knowledge, (b) suggests new potential regulators and drivers of lipid accumulation, and (c) experimentally validates the impact of six of the nine regulators identified on lipid accumulation, with variations in lipid content from +43.2% to -31.2% on glucose or glycerol.
Fast half-sibling population reconstruction: theory and algorithms.

Science.gov (United States)

Dexter, Daniel; Brown, Daniel G

2013-07-12

Kinship inference is the task of identifying genealogically related individuals. Kinship information is important for determining mating structures, notably in endangered populations. Although many solutions exist for reconstructing full sibling relationships, few exist for half-siblings. We consider the problem of determining whether a proposed half-sibling population reconstruction is valid under Mendelian inheritance assumptions. We show that this problem is NP-complete and provide a 0/1 integer program that identifies the minimum number of individuals that must be removed from a population in order for the reconstruction to become valid. We also present SibJoin, a heuristic-based clustering approach based on Mendelian genetics, which is strikingly fast. The software is available at http://github.com/ddexter/SibJoin.git+. Our SibJoin algorithm is reasonably accurate and thousands of times faster than existing algorithms. The heuristic is used to infer a half-sibling structure for a population which was, until recently, too large to evaluate.

A Local Poisson Graphical Model for inferring networks from sequencing data.

Science.gov (United States)

Allen, Genevera I; Liu, Zhandong

2013-09-01

Gaussian graphical models, a class of undirected graphs or Markov Networks, are often used to infer gene networks based on microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies such as RNA-sequencing or next generation sequencing to measure gene expression. As the resulting data consists of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for this discrete data. In this paper, we propose a novel method for inferring gene networks from sequencing data: the Local Poisson Graphical Model. Our model assumes a Local Markov property where each variable conditional on all other variables is Poisson distributed. We develop a neighborhood selection algorithm to fit our model locally by performing a series of l1 penalized Poisson, or log-linear, regressions. This yields a fast parallel algorithm for estimating networks from next generation sequencing data. In simulations, we illustrate the effectiveness of our methods for recovering network structure from count data. A case study on breast cancer microRNAs (miRNAs), a novel application of graphical models, finds known regulators of breast cancer genes and discovers novel miRNA clusters and hubs that are targets for future research.
Design and simplification of Adaptive Neuro-Fuzzy Inference Controllers for power plants

Energy Technology Data Exchange (ETDEWEB)

Alturki, F.A.; Abdennour, A. [King Saud University, Riyadh (Saudi Arabia). Electrical Engineering Dept.

1999-10-01

This article presents the design of an Adaptive Neuro-Fuzzy Inference Controller (ANFIC) for a 160 MW power plant. The space of operating conditions of the plant is partitioned into five regions. For each of the regions, an optimal controller is designed to meet a set of design objectives. The resulting five linear controllers are used to train the ANFIC. To enhance the applicability of the control system, a new algorithm that reduces the fuzzy rules to the most essential ones is also presented. This algorithm offers substantial savings in computation time while maintaining the performance and robustness of the original controller. (author)
Progranulin haploinsufficiency causes biphasic social dominance abnormalities in the tube test.

Science.gov (United States)

Arrant, A E; Filiano, A J; Warmus, B A; Hall, A M; Roberson, E D

2016-07-01

Loss-of-function mutations in progranulin (GRN) are a major autosomal dominant cause of frontotemporal dementia (FTD), a neurodegenerative disorder in which social behavior is disrupted. Progranulin-insufficient mice, both Grn(+/-) and Grn(-/-), are used as models of FTD due to GRN mutations, with Grn(+/-) mice mimicking the progranulin haploinsufficiency of FTD patients with GRN mutations. Grn(+/-) mice have increased social dominance in the tube test at 6 months of age, although this phenotype has not been reported in Grn(-/-) mice. In this study, we investigated how the tube test phenotype of progranulin-insufficient mice changes with age, determined its robustness under several testing conditions, and explored the associated cellular mechanisms. We observed biphasic social dominance abnormalities in Grn(+/-) mice: at 6-8 months, Grn(+/-) mice were more dominant than wild-type littermates, while after 9 months of age, Grn(+/-) mice were less dominant. In contrast, Grn(-/-) mice did not exhibit abnormal social dominance, suggesting that progranulin haploinsufficiency has distinct effects from complete progranulin deficiency. The biphasic tube test phenotype of Grn(+/-) mice was associated with abnormal cellular signaling and neuronal morphology in the amygdala and prefrontal cortex. At 6-9 months, Grn(+/-) mice exhibited increased mTORC2/Akt signaling in the amygdala and enhanced dendritic arbors in the basomedial amygdala, and at 9-16 months Grn(+/-) mice exhibited diminished basal dendritic arbors in the prelimbic cortex. These data show a progressive change in tube test dominance in Grn(+/-) mice and highlight potential underlying mechanisms by which progranulin insufficiency may disrupt social behavior. © 2016 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Gauging Variational Inference

Energy Technology Data Exchange (ETDEWEB)

Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahn, Sungsoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of); Shin, Jinwoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of)

2017-05-25

Computing the partition function is the most important statistical inference task arising in applications of Graphical Models (GMs). Since it is computationally intractable, approximate methods have been used to resolve the issue in practice, where mean-field (MF) and belief propagation (BP) are arguably the most popular and successful approaches of a variational type. In this paper, we propose two new variational schemes, coined Gauged-MF (G-MF) and Gauged-BP (G-BP), improving MF and BP, respectively. Both provide lower bounds for the partition function by utilizing the so-called gauge transformation, which modifies factors of the GM while keeping the partition function invariant. Moreover, we prove that both G-MF and G-BP are exact for GMs with a single loop of a special structure, even though the bare MF and BP perform badly in this case. Our extensive experiments, on complete GMs of relatively small size and on large GMs (up to 300 variables), confirm that the newly proposed algorithms outperform and generalize MF and BP.
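To make the notion of a variational lower bound on the partition function concrete, here is a minimal sketch of the bare mean-field (MF) bound on a tiny Ising-type model, compared against brute-force enumeration. It does not implement the gauged schemes G-MF or G-BP; the model (spins in {-1, +1}, symmetric couplings `J`, fields `h`) and all function names are illustrative assumptions.

```python
import itertools
import math

def log_partition_exact(J, h):
    """Brute-force log Z for weight exp(sum_{i<j} J[i][j] s_i s_j + sum_i h[i] s_i)."""
    n = len(h)
    total = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        e = sum(h[i] * s[i] for i in range(n))
        e += sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
        total += math.exp(e)
    return math.log(total)

def mean_field_bound(J, h, sweeps=200):
    """Mean-field lower bound on log Z via coordinate updates (J symmetric)."""
    n = len(h)
    m = [0.1] * n                         # magnetisations E_q[s_i]
    for _ in range(sweeps):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            m[i] = math.tanh(field)
    # Expected log-weight under the product distribution q.
    energy = sum(h[i] * m[i] for i in range(n))
    energy += sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))
    # Entropy of q, summed over independent binary spins.
    entropy = 0.0
    for mi in m:
        for p in ((1 + mi) / 2, (1 - mi) / 2):
            if p > 0:
                entropy -= p * math.log(p)
    return energy + entropy

J = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]   # single 3-spin loop
h = [0.2, 0.2, 0.2]
print(mean_field_bound(J, h), "<=", log_partition_exact(J, h))
```

The bound holds for any product distribution, so it remains valid even if the coordinate updates have not converged; the gap to the exact value is what improved schemes aim to shrink.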
Neurogenetic Algorithm for Solving Combinatorial Engineering Problems

Directory of Open Access Journals (Sweden)

M. Jalali Varnamkhasti

2012-01-01

Diversity of the population in a genetic algorithm plays an important role in impeding premature convergence. This paper proposes an adaptive neurofuzzy inference system genetic algorithm based on sexual selection. In this technique, for choosing the female chromosome during sexual selection, a bilinear allocation lifetime approach is used to label the chromosomes based on their fitness value, which is then used to characterize the diversity of the population. The motivation of this algorithm is to maintain population diversity throughout the search procedure. To promote diversity, the proposed algorithm combines the concepts of gender and age of individuals with fuzzy logic during the selection of parents. To appraise the performance of the techniques used in this study, one chemistry problem and some nonlinear functions available in the literature are used.
Bootstrapping phylogenies inferred from rearrangement data

Directory of Open Access Journals (Sweden)

Lin Yu

2012-08-01

Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results: We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions: Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its
Bootstrapping phylogenies inferred from rearrangement data.

Science.gov (United States)

Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me

2012-08-29

Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
Clonorchis sinensis granulin: identification, immunolocalization, and function in promoting the metastasis of cholangiocarcinoma and hepatocellular carcinoma.

Science.gov (United States)

Wang, Caiqin; Lei, Huali; Tian, Yanli; Shang, Mei; Wu, Yinjuan; Li, Ye; Zhao, Lu; Shi, Mengchen; Tang, Xin; Chen, Tingjin; Lv, Zhiyue; Huang, Yan; Tang, Xiaoping; Yu, Xinbing; Li, Xuerong

2017-05-25

Long-term infections by Clonorchis sinensis are associated with cholangitis, cholecystitis, liver fibrosis, cirrhosis, and even liver cancer. Molecules from the worm play vital roles in disease progress. In the present study, we identified and explored molecular characterization of C. sinensis granulin (CsGRN), a growth factor-like protein from C. sinensis excretory/secretory products (CsESPs). The encoding sequence and conserved domains of CsGRN were identified and analysed by bioinformatics tools. Recombinant CsGRN (rCsGRN) protein was expressed in Escherichia coli BL21 (DE3). The localisation of CsGRN in adult worms and Balb/c mice infected with C. sinensis was investigated by immunofluorescence and immunohistochemistry, respectively. Stable CsGRN-overexpressed cell lines of hepatoma cells (PLC-GRN cells) and cholangiocarcinoma cells (RBE-GRN cells) were constructed by transfection of eukaryotic expression plasmid of pEGFP-C1-CsGRN. The effects on cell migration and invasion of CsGRN were assessed through the wound-healing assay and transwell assay. The levels of matrix metalloproteinase 2 and 9 (MMP2 and MMP9) in PLC-GRN or RBE-GRN cells were detected by real-time PCR (qRT-PCR). The levels of E-cadherin, vimentin, N-cadherin, zona occludens proteins (ZO-1), β-catenin, phosphorylated ERK (p-ERK) and phosphorylated AKT (p-AKT) were analysed by Western blotting. CsGRN, including the conserved GRN domains, was confirmed to be a member of the granulin family. CsGRN was identified as an ingredient of CsESPs. CsGRN was localised in the tegument and testes of the adult worm. Furthermore, it appeared in the cytoplasm of hepatocytes and biliary epithelium cells from infected Balb/c mouse. The enhancement of cell migration and invasion of PLC-GRN and RBE-GRN cells were observed. In addition, CsGRN upregulated the levels of vimentin, N-cadherin, β-catenin, MMP2 and MMP9, while it downregulated the level of ZO-1 in PLC-GRN/RBE-GRN cells. In total proteins of liver tissue
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction

Directory of Open Access Journals (Sweden)

Seong-Gon Kim

2011-06-01

Several techniques, such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods, have been used in the past to approach the complex non-linear task of predicting the alpha-helices, beta-sheets and turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained Multilayered Perceptrons (MLPs) as the likelihood models within a Bayesian Inference framework to predict the secondary structures of proteins. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the Neural Net models and the Bayesian Inference process until the posterior secondary structure probability converges.
Automatic segmentation of coronary angiograms based on fuzzy inferring and probabilistic tracking

Directory of Open Access Journals (Sweden)

Shoujun Zhou

2010-08-01

Background: Segmentation of the coronary angiogram is important in computer-assisted artery motion analysis or reconstruction of 3D vascular structures from a single-plane or biplane angiographic system. Developing fully automated and accurate vessel segmentation algorithms is highly challenging, especially when extracting vascular structures with large variations in image intensities and noise, as well as with variable cross-sections or vascular lesions. Methods: This paper presents a novel tracking method for automatic segmentation of the coronary artery tree in X-ray angiographic images, based on probabilistic vessel tracking and fuzzy structure pattern inferring. The method is composed of two main steps: preprocessing and tracking. In preprocessing, multiscale Gabor filtering and Hessian matrix analysis were used to enhance and extract vessel features from the original angiographic image, leading to a vessel feature map as well as a vessel direction map. In tracking, a seed point was first automatically detected by analyzing the vessel feature map. Subsequently, two operators, a probabilistic tracking operator (PTO) and a vessel structure pattern detector (SPD), worked together based on the detected seed point to extract vessel segments or branches one at a time. The local structure pattern was inferred by a multi-feature based fuzzy inferring function employed in the SPD. The identified structure pattern, such as crossing or bifurcation, was used to control the tracking process, for example, to keep tracking the current segment or start tracking a new one, depending on the detected pattern. Results: By appropriate integration of these advanced preprocessing and tracking steps, our tracking algorithm is able to extract both vessel axis lines and edge points, as well as measure the arterial diameters in various complicated cases. For example, it can walk across gaps along the longitudinal vessel direction, manage varying vessel
Cognitive Inference Device for Activity Supervision in the Elderly

Directory of Open Access Journals (Sweden)

Nilamadhab Mishra

2014-01-01

Human activity, life span, and quality of life are enhanced by innovations in science and technology. Aging individuals need to take advantage of these developments to lead a self-regulated life. However, maintaining a self-regulated life at old age involves a high degree of risk, and the elderly often fail at this goal. Thus, the objective of our study is to investigate the feasibility of implementing a cognitive inference device (CI-device) for effective activity supervision in the elderly. To frame the CI-device, we propose a device design framework along with an inference algorithm and implement the designs through an artificial neural model with different configurations, mapping the CI-device's functions to minimise the device's prediction error. An analysis and discussion are then provided to validate the feasibility of CI-device implementation for activity supervision in the elderly.
Evolution in Mind: Evolutionary Dynamics, Cognitive Processes, and Bayesian Inference.

Science.gov (United States)

Suchow, Jordan W; Bourgin, David D; Griffiths, Thomas L

2017-07-01

Evolutionary theory describes the dynamics of population change in settings affected by reproduction, selection, mutation, and drift. In the context of human cognition, evolutionary theory is most often invoked to explain the origins of capacities such as language, metacognition, and spatial reasoning, framing them as functional adaptations to an ancestral environment. However, evolutionary theory is useful for understanding the mind in a second way: as a mathematical framework for describing evolving populations of thoughts, ideas, and memories within a single mind. In fact, deep correspondences exist between the mathematics of evolution and of learning, with perhaps the deepest being an equivalence between certain evolutionary dynamics and Bayesian inference. This equivalence permits reinterpretation of evolutionary processes as algorithms for Bayesian inference and has relevance for understanding diverse cognitive capacities, including memory and creativity. Copyright © 2017 Elsevier Ltd. All rights reserved.
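
The equivalence mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the correspondence in question is the commonly cited one between the discrete-time replicator equation and Bayes' rule, with type frequencies playing the role of the prior and fitness playing the role of the likelihood. The function names and the numbers are hypothetical and are not taken from the paper.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses via Bayes' rule."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]


def replicator_step(frequencies, fitness):
    """One generation of the discrete-time replicator dynamics."""
    mean_fitness = sum(x * f for x, f in zip(frequencies, fitness))
    return [x * f / mean_fitness for x, f in zip(frequencies, fitness)]


# Hypothetical numbers: three hypotheses (or three types in a population).
prior = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.4, 0.7]

posterior = bayes_update(prior, likelihood)
next_generation = replicator_step(prior, likelihood)
```

Under this reading, `posterior` and `next_generation` coincide because the two functions perform the same normalized reweighting; only the interpretation of the inputs differs.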
A Multiobjective Fuzzy Inference System based Deployment Strategy for a Distributed Mobile Sensor Network

Directory of Open Access Journals (Sweden)

Amol P. Bhondekar

2010-03-01

Full Text Available The sensor deployment scheme strongly governs the effectiveness of a distributed wireless sensor network. Issues such as energy conservation and clustering make the deployment problem much more complex. A multiobjective Fuzzy Inference System based strategy for mobile sensor deployment is presented in this paper. This strategy gives a synergistic combination of energy capacity, clustering and peer-to-peer deployment. Performance of our strategy is evaluated in terms of coverage, uniformity, speed and clustering. Our algorithm is compared against a modified distributed self-spreading algorithm and is shown to perform better.
Hand based visual intent recognition algorithm for wheelchair motion

CSIR Research Space (South Africa)

Luhandjula, T

2010-05-01

Full Text Available This paper describes an algorithm for a visual human-machine interface that infers a person’s intention from the motion of the hand. Work in progress shows a proof of concept tested on static images. The context for which this solution is intended...
Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

NARCIS (Netherlands)

Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

2009-01-01

Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori
Development Modules for Specification of Requirements for a System of Verification of Parallel Algorithms

Directory of Open Access Journals (Sweden)

Vasiliy Yu. Meltsov

2012-05-01

Full Text Available This paper presents the results of developing one of the modules of a system for the verification of parallel algorithms that are used to verify the inference engine. This module is designed to build the specification of requirements whose feasibility on the algorithm must be proved (tested).
Inferring pregnancy episodes and outcomes within a network of observational databases.

Directory of Open Access Journals (Sweden)

Amy Matcho

Full Text Available Administrative claims and electronic health records are valuable resources for evaluating pharmaceutical effects during pregnancy. However, direct measures of gestational age are generally not available. Establishing a reliable approach to infer the duration and outcome of a pregnancy could improve pharmacovigilance activities. We developed and applied an algorithm to define pregnancy episodes in four observational databases: three US-based claims databases: Truven MarketScan® Commercial Claims and Encounters (CCAE), Truven MarketScan® Multi-state Medicaid (MDCD), and the Optum ClinFormatics® (Optum) database and one non-US database, the United Kingdom (UK) based Clinical Practice Research Datalink (CPRD). Pregnancy outcomes were classified as live births, stillbirths, abortions and ectopic pregnancies. Start dates were estimated using a derived hierarchy of available pregnancy markers, including records such as last menstrual period and nuchal ultrasound dates. Validation included clinical adjudication of 700 electronic Optum and CPRD pregnancy episode profiles to assess the operating characteristics of the algorithm, and a comparison of the algorithm's Optum pregnancy start estimates to starts based on dates of assisted conception procedures. Distributions of pregnancy outcome types were similar across all four data sources and pregnancy episode lengths found were as expected for all outcomes, excepting term lengths in episodes that used amenorrhea and urine pregnancy tests for start estimation. Validation survey results found highest agreement between reviewer chosen and algorithm operating characteristics for questions assessing pregnancy status and accuracy of outcome category with 99-100% agreement for Optum and CPRD. Outcome date agreement within seven days in either direction ranged from 95-100%, while start date agreement within seven days in either direction ranged from 90-97%. In Optum validation sensitivity analysis, a total of 73% of
Sparse linear models: Variational approximate inference and Bayesian experimental design

International Nuclear Information System (INIS)

Seeger, Matthias W

2009-01-01

A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Inferring the gene network underlying the branching of tomato inflorescence.

Directory of Open Access Journals (Sweden)

Laura Astola

Full Text Available The architecture of tomato inflorescence strongly affects flower production and subsequent crop yield. To understand the genetic activities involved, insight into the underlying network of genes that initiate and control the sympodial growth in the tomato is essential. In this paper, we show how the structure of this network can be derived from available data of the expressions of the involved genes. Our approach starts from employing biological expert knowledge to select the most probable gene candidates behind branching behavior. To find how these genes interact, we develop a stepwise procedure for computational inference of the network structure. Our data consists of expression levels from primary shoot meristems, measured at different developmental stages on three different genotypes of tomato. With the network inferred by our algorithm, we can explain the dynamics corresponding to all three genotypes simultaneously, despite their apparent dissimilarities. We also correctly predict the chronological order of expression peaks for the main hubs in the network. Based on the inferred network, using optimal experimental design criteria, we are able to suggest an informative set of experiments for further investigation of the mechanisms underlying branching behavior.
Sparse linear models: Variational approximate inference and Bayesian experimental design

Energy Technology Data Exchange (ETDEWEB)

Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)

2009-12-01

A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attention has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.

ITrace: An implicit trust inference method for trust-aware collaborative filtering

Science.gov (United States)

He, Xu; Liu, Bin; Chen, Kejia

2018-04-01

The growth of Internet commerce has stimulated the use of collaborative filtering (CF) algorithms as recommender systems. A CF algorithm recommends items of interest to the target user by leveraging the votes given by other similar users. In a standard CF framework, it is assumed that the credibility of every voting user is exactly the same with respect to the target user. This assumption is not satisfied and thus may lead to misleading recommendations in many practical applications. A natural countermeasure is to design a trust-aware CF (TaCF) algorithm, which can take account of the difference in the credibilities of the voting users when performing CF. To this end, this paper presents a trust inference approach, which can predict the implicit trust of the target user on every voting user from a sparse explicit trust matrix. Then an improved CF algorithm termed iTrace is proposed, which takes advantage of both the explicit and the predicted implicit trust to provide recommendations with the CF framework. An empirical evaluation on a public dataset demonstrates that the proposed algorithm provides a significant improvement in recommendation quality in terms of mean absolute error.
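
The idea of weighting each voting user by trust, and of scoring the result by mean absolute error, can be sketched in Python as follows. This is a generic trust-weighted average and not the iTrace algorithm itself: the abstract does not give the prediction rule, so the function names, the dictionary layout, and the example values are all assumptions made for illustration.

```python
def predict_rating(votes, trust):
    """Trust-weighted mean of the votes cast on one item.

    votes: {user: rating}; trust: {user: weight}, where the weight stands
    for the target user's (explicit or inferred) trust in that voter.
    Voters without a trust entry get weight 0.
    """
    weighted = [(trust.get(user, 0.0), rating) for user, rating in votes.items()]
    total = sum(weight for weight, _ in weighted)
    if total == 0:
        return None  # no trusted voter, so no prediction
    return sum(weight * rating for weight, rating in weighted) / total


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and observed ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical example: the target user trusts "ann" more than "bob".
votes = {"ann": 5.0, "bob": 1.0}
trust = {"ann": 0.75, "bob": 0.25}
prediction = predict_rating(votes, trust)
```

A standard CF baseline corresponds to giving every voter the same weight, which is the equal-credibility assumption the abstract argues against.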
Performance Evaluation of the Machine Learning Algorithms Used in Inference Mechanism of a Medical Decision Support System

Directory of Open Access Journals (Sweden)

Mert Bal

2014-01-01

Full Text Available Decision support systems are increasingly important in supporting the decision-making process under uncertainty and lack of information, and they are widely used in various fields such as engineering, finance, and medicine. Medical decision support systems help healthcare personnel select the optimal method during the treatment of patients. Decision support systems are intelligent software systems that support decision makers in their decisions. The design of decision support systems consists of four main components: the inference mechanism, knowledge base, explanation module, and active memory. The inference mechanism constitutes the basis of decision support systems. Various methods can be used in these mechanisms, including decision trees, artificial neural networks, statistical methods, and rule-based methods. In decision support systems, these methods can be used separately or combined in a hybrid system. In this study, synthetic data with 10, 100, 1000, and 2000 records have been produced to reflect the probabilities on the ALARM network. The accuracy of 11 machine learning methods for the inference mechanism of a medical decision support system is compared on various data sets.
Performance evaluation of the machine learning algorithms used in inference mechanism of a medical decision support system.

Science.gov (United States)

Bal, Mert; Amasyali, M Fatih; Sever, Hayri; Kose, Guven; Demirhan, Ayse

2014-01-01

Decision support systems are increasingly important in supporting the decision-making process under uncertainty and lack of information, and they are widely used in various fields such as engineering, finance, and medicine. Medical decision support systems help healthcare personnel select the optimal method during the treatment of patients. Decision support systems are intelligent software systems that support decision makers in their decisions. The design of decision support systems consists of four main components: the inference mechanism, knowledge base, explanation module, and active memory. The inference mechanism constitutes the basis of decision support systems. Various methods can be used in these mechanisms, including decision trees, artificial neural networks, statistical methods, and rule-based methods. In decision support systems, these methods can be used separately or combined in a hybrid system. In this study, synthetic data with 10, 100, 1000, and 2000 records have been produced to reflect the probabilities on the ALARM network. The accuracy of 11 machine learning methods for the inference mechanism of a medical decision support system is compared on various data sets.
Detection of multiple damages employing best achievable eigenvectors under Bayesian inference

Science.gov (United States)

Prajapat, Kanta; Ray-Chaudhuri, Samit

2018-05-01

A novel approach is presented in this work to localize simultaneously multiple damaged elements in a structure along with the estimation of damage severity for each of the damaged elements. For detection of damaged elements, a best achievable eigenvector based formulation has been derived. To deal with noisy data, Bayesian inference is employed in the formulation wherein the likelihood of the Bayesian algorithm is formed on the basis of errors between the best achievable eigenvectors and the measured modes. In this approach, the most probable damage locations are evaluated under Bayesian inference by generating combinations of various possible damaged elements. Once damage locations are identified, damage severities are estimated using a Bayesian inference Markov chain Monte Carlo simulation. The efficiency of the proposed approach has been demonstrated by carrying out a numerical study involving a 12-story shear building. It has been found from this study that damage scenarios involving as little as 10% loss of stiffness in multiple elements are accurately determined (localized and severities quantified) even when modal data contaminated with 2% noise are utilized. Further, this study introduces a term, parameter impact (evaluated based on the sensitivity of modal parameters towards structural parameters), to decide the suitability of selecting a particular mode if some idea about the damaged elements is available. It has been demonstrated here that the accuracy and efficiency of the Bayesian quantification algorithm increase if damage localization is carried out a priori. An experimental study involving a laboratory scale shear building and different stiffness modification scenarios shows that the proposed approach is efficient enough to localize the stories with stiffness modification.
Bayesian pedigree inference with small numbers of single nucleotide polymorphisms via a factor-graph representation.

Science.gov (United States)

Anderson, Eric C; Ng, Thomas C

2016-02-01

We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Evaluation of a new neutron energy spectrum unfolding code based on an Adaptive Neuro-Fuzzy Inference System (ANFIS).

Science.gov (United States)

Hosseini, Seyed Abolfazl; Esmaili Paeen Afrakoti, Iman

2018-01-17

The purpose of the present study was to reconstruct the energy spectrum of a poly-energetic neutron source using an algorithm developed based on an Adaptive Neuro-Fuzzy Inference System (ANFIS). ANFIS is a kind of artificial neural network based on the Takagi-Sugeno fuzzy inference system. The ANFIS algorithm uses the advantages of both fuzzy inference systems and artificial neural networks to improve the effectiveness of algorithms in various applications such as modeling, control and classification. The neutron pulse height distributions used as input data in the training procedure for the ANFIS algorithm were obtained from the simulations performed by MCNPX-ESUT computational code (MCNPX-Energy engineering of Sharif University of Technology). Taking into account the normalization condition of each energy spectrum, 4300 neutron energy spectra were generated randomly. (The value in each bin was generated randomly, and finally a normalization of each generated energy spectrum was performed). The randomly generated neutron energy spectra were considered as output data of the developed ANFIS computational code in the training step. To calculate the neutron energy spectrum using conventional methods, an inverse problem with an approximately singular response matrix (with the determinant of the matrix close to zero) should be solved. Solving the inverse problem with conventional methods unfolds the neutron energy spectrum with low accuracy. Application of iterative algorithms to the solution of such a problem, or use of intelligent algorithms (in which there is no need to solve the problem), is usually preferred for unfolding of the energy spectrum. Therefore, the main reason for development of intelligent algorithms like ANFIS for unfolding of neutron energy spectra is to avoid solving the inverse problem. In the present study, the unfolded neutron energy spectra of 252Cf and 241Am-9Be neutron sources using the developed computational code were
NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

Science.gov (United States)

Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

2015-09-29

In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performance.
STUDENT PREDICTION SYSTEM FOR PLACEMENT TRAINING USING FUZZY INFERENCE SYSTEM

Directory of Open Access Journals (Sweden)

Ravi Kumar Rathore

2017-04-01

Full Text Available The proposed student prediction system is a vital approach that may be used to differentiate student data on the basis of student performance. Managing placement and training records in any large organization is quite difficult because the number of students is high; under such conditions, differentiation and classification into different categories become tedious. The proposed fuzzy inference system classifies the student data with ease and will be helpful to many educational organizations. There are many classification algorithms and statistics-based techniques that may be good assets for classifying student data sets in the education field. In this paper, a Fuzzy Inference System has been applied to predict student performance, which helps to identify the performance of the students and also provides an opportunity to improve that performance. For instance, here we classify the students' data set into placement and non-placement classes.
Niffler: A Context-Aware and User-Independent Side-Channel Attack System for Password Inference

Directory of Open Access Journals (Sweden)

Benxiao Tang

2018-01-01

Full Text Available Digital password locks are commonly used on mobile devices as the primary authentication method. Research has demonstrated that sensors embedded in mobile devices can be employed to infer the password. However, existing works focus on either single keystroke inference or entire password sequence inference, which are user-dependent and require huge efforts to collect the ground truth training data. In this paper, we design a novel side-channel attack system, called Niffler, which leverages the user-independent features of movements of tapping consecutive buttons to infer unlocking passwords on smartphones. We extract angle features to reflect the changing trends and build a multicategory classifier combining the dynamic time warping algorithm to infer the probability of each movement. We further use the Markov model to model the unlocking process and use the sequences with the highest probabilities as the attack candidates. Moreover, the sensor readings of successful attacks will be further fed back to continually improve the accuracy of the classifier. In our experiments, 100,000 samples collected from 25 participants are used to evaluate the performance of Niffler. The results show that Niffler achieves 70% and 85% accuracy with 10 attempts in user-independent and user-dependent environments with few training samples, respectively.
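
The dynamic time warping (DTW) step named in this abstract can be illustrated with a minimal Python sketch of the textbook DTW distance between two one-dimensional sequences. It does not reproduce Niffler's angle features or its classifier, which the abstract does not specify; the function name `dtw_distance` and the sample sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j],
    using the absolute difference as the local distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical sequences: the second is the first with one sample repeated,
# so warping aligns them at zero cost despite the different lengths.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

DTW suits this kind of sensor data because two taps of the same movement rarely take exactly the same time, and the warping absorbs that difference in pace.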
Entropic Inference

Science.gov (United States)

Caticha, Ariel

2011-03-01

In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops, the Maximum Entropy and the Bayesian methods, into a single general inference scheme.
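
For reference, the logarithmic relative entropy mentioned in this abstract is usually written as below. The notation is an assumption, since the abstract does not fix symbols: q is taken as the prior, p as the candidate posterior, and x as the variable of interest.

```latex
S[p, q] = -\int p(x) \, \log \frac{p(x)}{q(x)} \, dx
```

In this form, the ME update selects the distribution p that maximizes S[p, q] subject to the constraints that encode the new information. With a uniform prior q, maximizing S[p, q] reduces to maximizing the Shannon entropy of p, which is the MaxEnt special case the abstract refers to.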
MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics.

Science.gov (United States)

Helaers, Raphaël; Milinkovitch, Michel C

2010-07-15

The development, in the last decade, of stochastic heuristics implemented in robust application software has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA) together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameter setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs on 32- and 64-bit systems, and takes advantage of multiprocessor and multicore computers. The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics in a single software package will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these algorithms. MetaPIGA v2.0 gives access both to high
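
The two information criteria named in this abstract have standard closed forms, which the following Python sketch applies to model selection. It is not MetaPIGA's code or interface: the helper names, the candidate labels, and the log-likelihood values are hypothetical, and taking n as the number of alignment sites is an assumption.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


def best_model(candidates, n, criterion="aic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates: {name: (maximized log-likelihood, number of free parameters)}
    n: sample size (assumed here to be the number of alignment sites).
    """
    def score(item):
        log_likelihood, k = item[1]
        if criterion == "aic":
            return aic(log_likelihood, k)
        return bic(log_likelihood, k, n)

    return min(candidates.items(), key=score)[0]


# Hypothetical fits: the richer model gains 10 log-likelihood units
# at the cost of 9 extra parameters.
candidates = {"simple": (-1000.0, 1), "complex": (-990.0, 10)}
```

With these made-up numbers and n = 500, AIC prefers the richer model while BIC, which penalizes parameters more heavily at this sample size, prefers the simpler one.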
Inference for feature selection using the Lasso with high-dimensional data

DEFF Research Database (Denmark)

Brink-Jensen, Kasper; Ekstrøm, Claus Thorn

2014-01-01

Penalized regression models such as the Lasso have proved useful for variable selection in many fields - especially for situations with high-dimensional data where the number of predictors far exceeds the number of observations. These methods identify and rank variables of importance but do...... not generally provide any inference of the selected variables. Thus, the variables selected might be the "most important" but need not be significant. We propose a significance test for the selection found by the Lasso. We introduce a procedure that computes inference and p-values for features chosen...... by the Lasso. This method rephrases the null hypothesis and uses a randomization approach which ensures that the error rate is controlled even for small samples. We demonstrate the ability of the algorithm to compute p-values of the expected magnitude with simulated data using a multitude of scenarios...
Characterization and in vitro evaluation of freeze-dried microparticles composed of granisetron-cyclodextrin complex and carboxymethylcellulose for intranasal delivery.

Science.gov (United States)

Cho, Hyun-Jong; Balakrishnan, Prabagar; Shim, Won-Sik; Chung, Suk-Jae; Shim, Chang-Koo; Kim, Dae-Duk

2010-11-15

The aim of this study was to prepare microparticles (MPs) of granisetron (GRN) in combination with hydroxypropyl-β-cyclodextrin (HP-β-CD) and sodium carboxymethylcellulose (CMC-Na) by the simple freeze-drying method for intranasal delivery. The composition of MPs was determined from the phase-solubility study of GRN in various CDs. Fourier transform infrared spectroscopy (FT-IR), powder X-ray diffraction (PXRD) analysis and differential scanning calorimetry (DSC) studies were performed to evaluate possible interactions between GRN and excipients. The results indicated the formation of inclusion complex between GRN and CD, and the conversion of drug into amorphous state. The in vitro release of GRN from MPs was determined in phosphate buffered saline (pH 6.4) at 37°C. Cytotoxicity of the MPs and in vitro permeation study were conducted by using primary human nasal epithelial (HNE) cells and their monolayer system cultured by air-liquid interface (ALI) method, respectively. The MPs showed significantly higher GRN release profile compared to pure GRN. Moreover, the prepared MPs showed significantly lower cytotoxicity and higher permeation profile than that of GRN powder (p<0.05). These results suggested that the MPs composed of GRN, HP-β-CD and CMC-Na represent a simple and new GRN intranasal delivery system as an alternative to the oral and intravenous administration of GRN. Copyright © 2010 Elsevier B.V. All rights reserved.
Bayesian nonparametric generative models for causal inference with missing at random covariates.

Science.gov (United States)

Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J

2018-03-26

We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.
Long-time analytic approximation of large stochastic oscillators: Simulation, analysis and inference.

Directory of Open Access Journals (Sweden)

Giorgos Minas

2017-07-01

Full Text Available In order to analyse large complex stochastic dynamical models such as those studied in systems biology there is currently a great need for both analytical tools and also algorithms for accurate and fast simulation and estimation. We present a new stochastic approximation of biological oscillators that addresses these needs. Our method, called phase-corrected LNA (pcLNA), overcomes the main limitations of the standard Linear Noise Approximation (LNA) to remain uniformly accurate for long times, while maintaining the speed and analytical tractability of the LNA. As part of this, we develop analytical expressions for key probability distributions and associated quantities, such as the Fisher Information Matrix and Kullback-Leibler divergence, and we introduce a new approach to system-global sensitivity analysis. We also present algorithms for statistical inference and for long-term simulation of oscillating systems that are shown to be as accurate as, but much faster than, leaping algorithms and algorithms for integration of diffusion equations. Stochastic versions of published models of the circadian clock and NF-κB system are used to illustrate our results.
More than one kind of inference: re-examining what's learned in feature inference and classification.

Science.gov (United States)

Sweller, Naomi; Hayes, Brett K

2010-08-01

Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
A unified framework for haplotype inference in nuclear families.

Science.gov (United States)

Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong

2012-07-01

Many large genome-wide association studies include nuclear families with more than one child (trio families), allowing for analysis of differences between siblings (sib pair analysis). Statistical power can be increased when haplotypes are used instead of genotypes. Currently, haplotype inference in families with more than one child can be performed either using the familial information or statistical information derived from the population samples but not both. Building on our recently proposed tree-based deterministic framework (TDS) for trio families, we augment its applicability to general nuclear families. We impose a minimum recombinant approach locally and independently on each multiple children family, while resorting to the population-derived information to solve the remaining ambiguities. Thus our framework incorporates all available information (familial and population) in a given study. We demonstrate that using all the constraints in our approach we can have gains in the accuracy as opposed to breaking the multiple children families to separate trios and resorting to a trio inference algorithm or phasing each family in isolation. We believe that our proposed framework could be the method of choice for haplotype inference in studies that include nuclear families with multiple children. Our software (tds2.0) is downloadable from www.ee.columbia.edu/∼anastas/tds. © 2012 The Authors Annals of Human Genetics © 2012 Blackwell Publishing Ltd/University College London.
Resource-Aware Data Fusion Algorithms for Wireless Sensor Networks

CERN Document Server

Abdelgawad, Ahmed

2012-01-01

This book introduces resource-aware data fusion algorithms to gather and combine data from multiple sources (e.g., sensors) in order to achieve inferences. These techniques can be used in centralized and distributed systems to overcome sensor failure, technological limitation, and spatial and temporal coverage problems. The algorithms described in this book are evaluated with simulation and experimental results to show they will maintain data integrity and make data useful and informative. Describes techniques to overcome real problems posed by wireless sensor networks deployed in circumstances that might interfere with measurements provided, such as strong variations of pressure, temperature, radiation, and electromagnetic noise; Uses simulation and experimental results to evaluate algorithms presented and includes real test-bed; Includes case study implementing data fusion algorithms on a remote monitoring framework for sand production in oil pipelines.
A human genome-wide library of local phylogeny predictions for whole-genome inference problems

Directory of Open Access Journals (Sweden)

Schwartz Russell

2008-08-01

Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.
Perceptual inference.

Science.gov (United States)

Aggelopoulos, Nikolaos C

2015-08-01

Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.

An improved recommended algorithm for network structure based on two partial graphs

Directory of Open Access Journals (Sweden)

Deng Song

2017-08-01

In this thesis, we introduce an improved algorithm based on network structure. Based on the standard material diffusion algorithm, and considering the influence of the user's score on the recommendation, the adjustment factor of the initial resource allocation vector and the resource transfer matrix in the recommendation algorithm is improved. Using the practical data set from the GroupLens website to evaluate the performance of the proposed algorithm, we performed a series of experiments. The experimental results reveal that it can yield better recommendation accuracy and a higher hitting rate than collaborative filtering and network-based inference. It can solve the problems of cold start and scalability in the standard material diffusion algorithm, and it can also make the recommendation results more diverse.
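The standard material (mass) diffusion step that this record builds on can be sketched as follows. This is only a minimal illustration of the unweighted baseline on a user-item bipartite graph, not the improved algorithm of the paper: the score-dependent adjustment factor is not reproduced, and the function name mass_diffusion_scores and the dictionary-of-sets input format are assumptions made for the example.

```python
from collections import defaultdict


def mass_diffusion_scores(user_items, target_user):
    """Score uncollected items for target_user by two-step resource spreading.

    user_items maps each user to the set of items they collected
    (hypothetical input format chosen for this sketch).
    """
    # Invert the user -> items map to get item -> users.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    # Each item collected by the target user starts with one unit of resource
    # and splits it equally among all users who collected that item.
    user_resource = defaultdict(float)
    for item in user_items[target_user]:
        share = 1.0 / len(item_users[item])
        for user in item_users[item]:
            user_resource[user] += share

    # Each user then splits the received resource equally among their items.
    item_score = defaultdict(float)
    for user, resource in user_resource.items():
        share = resource / len(user_items[user])
        for item in user_items[user]:
            item_score[item] += share

    # Only items the target user has not collected are candidates.
    collected = user_items[target_user]
    return {item: s for item, s in item_score.items() if item not in collected}
```

For the toy data {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"a", "c", "d"}} and target user "u1", item "c" receives 5/12 of the resource and item "d" receives 1/6, so "c" would be ranked first.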
Bayesian inference on EMRI signals using low frequency approximations

International Nuclear Information System (INIS)

Ali, Asad; Meyer, Renate; Christensen, Nelson; Röver, Christian

2012-01-01

Extreme mass ratio inspirals (EMRIs) are thought to be one of the most exciting gravitational wave sources to be detected with LISA. Due to their complicated nature and weak amplitudes the detection and parameter estimation of such sources is a challenging task. In this paper we present a statistical methodology based on Bayesian inference in which the estimation of parameters is carried out by advanced Markov chain Monte Carlo (MCMC) algorithms such as parallel tempering MCMC. We analysed high and medium mass EMRI systems that fall well inside the low frequency range of LISA. In the context of the Mock LISA Data Challenges, our investigation and results are also the first instance in which a fully Markovian algorithm is applied for EMRI searches. Results show that our algorithm worked well in recovering EMRI signals from different (simulated) LISA data sets having single and multiple EMRI sources and holds great promise for posterior computation under more realistic conditions. The search and estimation methods presented in this paper are general in their nature, and can be applied in any other scenario such as AdLIGO, AdVIRGO and Einstein Telescope with their respective response functions. (paper)
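For readers unfamiliar with parallel tempering MCMC, the exchange move between two tempered chains can be sketched as below. This is the textbook swap rule, not code from the paper: it assumes chain k targets likelihood^beta_k times the prior, and the function name and arguments are illustrative only.

```python
import math


def swap_acceptance_probability(beta_i, beta_j, loglik_i, loglik_j):
    """Probability of accepting a state swap between two tempered chains.

    beta_i, beta_j: inverse temperatures of the two chains (assumed in (0, 1]).
    loglik_i, loglik_j: log-likelihoods of the current states of chains i and j.
    """
    # Log of the Metropolis ratio for exchanging the two states; the prior
    # terms cancel because both chains share the same prior.
    log_ratio = (beta_i - beta_j) * (loglik_j - loglik_i)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
```

A swap that moves the higher-likelihood state to the colder chain (larger beta) is always accepted; the reverse swap is accepted only with probability exp(log_ratio).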
SEMANTIC PATCH INFERENCE

DEFF Research Database (Denmark)

Andersen, Jesper

2009-01-01

Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....
Inferring pregnancy episodes and outcomes within a network of observational databases

Science.gov (United States)

Ryan, Patrick; Fife, Daniel; Gifkins, Dina; Knoll, Chris; Friedman, Andrew

2018-01-01

Administrative claims and electronic health records are valuable resources for evaluating pharmaceutical effects during pregnancy. However, direct measures of gestational age are generally not available. Establishing a reliable approach to infer the duration and outcome of a pregnancy could improve pharmacovigilance activities. We developed and applied an algorithm to define pregnancy episodes in four observational databases: three US-based claims databases: Truven MarketScan® Commercial Claims and Encounters (CCAE), Truven MarketScan® Multi-state Medicaid (MDCD), and the Optum ClinFormatics® (Optum) database and one non-US database, the United Kingdom (UK) based Clinical Practice Research Datalink (CPRD). Pregnancy outcomes were classified as live births, stillbirths, abortions and ectopic pregnancies. Start dates were estimated using a derived hierarchy of available pregnancy markers, including records such as last menstrual period and nuchal ultrasound dates. Validation included clinical adjudication of 700 electronic Optum and CPRD pregnancy episode profiles to assess the operating characteristics of the algorithm, and a comparison of the algorithm’s Optum pregnancy start estimates to starts based on dates of assisted conception procedures. Distributions of pregnancy outcome types were similar across all four data sources and pregnancy episode lengths found were as expected for all outcomes, excepting term lengths in episodes that used amenorrhea and urine pregnancy tests for start estimation. Validation survey results found highest agreement between reviewer chosen and algorithm operating characteristics for questions assessing pregnancy status and accuracy of outcome category with 99–100% agreement for Optum and CPRD. Outcome date agreement within seven days in either direction ranged from 95–100%, while start date agreement within seven days in either direction ranged from 90–97%. In Optum validation sensitivity analysis, a total of 73% of
A Fuzzy Gravitational Search Algorithm to Design Optimal IIR Filters

Directory of Open Access Journals (Sweden)

Danilo Pelusi

2018-03-01

The goodness of Infinite Impulse Response (IIR) digital filter design depends on pass band ripple, stop band ripple and transition band values. The main problem is defining a suitable error fitness function that depends on these parameters. This fitness function can be optimized by search algorithms such as evolutionary algorithms. This paper proposes an intelligent algorithm for the design of optimal 8th order IIR filters. The main contribution is the design of Fuzzy Inference Systems able to tune key parameters of a revisited version of the Gravitational Search Algorithm (GSA). In this way, a Fuzzy Gravitational Search Algorithm (FGSA) is designed. The optimization performances of FGSA are compared with those of Differential Evolution (DE) and GSA. The results show that FGSA is the algorithm that gives the best compromise between goodness, robustness and convergence rate for the design of 8th order IIR filters. Moreover, FGSA assures a good stability of the designed filters.
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

Science.gov (United States)

Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

2000-01-01

Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Qualitative reasoning for biological network inference from systematic perturbation experiments.

Science.gov (United States)

Badaloni, Silvana; Di Camillo, Barbara; Sambo, Francesco

2012-01-01

The systematic perturbation of the components of a biological system has been proven among the most informative experimental setups for the identification of causal relations between the components. In this paper, we present Systematic Perturbation-Qualitative Reasoning (SPQR), a novel Qualitative Reasoning approach to automate the interpretation of the results of systematic perturbation experiments. Our method is based on a qualitative abstraction of the experimental data: for each perturbation experiment, measured values of the observed variables are modeled as lower, equal or higher than the measurements in the wild type condition, when no perturbation is applied. The algorithm exploits a set of IF-THEN rules to infer causal relations between the variables, analyzing the patterns of propagation of the perturbation signals through the biological network, and is specifically designed to minimize the rate of false positives among the inferred relations. Tested on both simulated and real perturbation data, SPQR indeed exhibits a significantly higher precision than the state of the art.
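The qualitative abstraction described in this abstract (each measurement labelled lower, equal or higher than the wild type) can be illustrated with a short sketch. The tolerance threshold, the function name and the dictionary inputs are assumptions for this example; the abstract does not say how SPQR decides that two measurements are equal, and the IF-THEN rule engine itself is not reproduced here.

```python
def qualitative_abstraction(perturbed, wild_type, tolerance=0.1):
    """Label each variable as 'lower', 'equal' or 'higher' than the wild type.

    perturbed, wild_type: dicts mapping variable names to measured values.
    tolerance: hypothetical absolute threshold below which a difference
    is treated as noise (an assumption of this sketch).
    """
    labels = {}
    for name, value in perturbed.items():
        diff = value - wild_type[name]
        if diff > tolerance:
            labels[name] = "higher"
        elif diff < -tolerance:
            labels[name] = "lower"
        else:
            labels[name] = "equal"
    return labels
```

Rules of the IF-THEN kind mentioned in the abstract would then operate on these labels rather than on the raw measurements.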
Multimodel inference and adaptive management

Science.gov (United States)

Rehme, S.E.; Powell, L.A.; Allen, Craig R.

2011-01-01

Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
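One common way to weigh competing models in multimodel inference is through Akaike weights. The abstract does not name the criterion used in the reviewed papers, so the choice of AIC here is an assumption, as is the function name; the sketch only shows how relative support is computed from a list of AIC values.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values of competing models into normalised Akaike weights."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference to the best model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]
```

When no single model carries most of the weight, the inference would be described as weak in the sense of the abstract, and model selection uncertainty should be carried into any recommendation.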
Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

Science.gov (United States)

Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S

2015-02-01

With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.
Bayesian Estimation and Inference using Stochastic Hardware

Directory of Open Access Journals (Sweden)

Chetan Singh Thakur

2016-03-01

In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Bayesian Estimation and Inference Using Stochastic Electronics.

Science.gov (United States)

Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André

2016-01-01

In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
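The Bayesian recursive equation mentioned for the one-dimensional tracker can be sketched in software as a discrete Bayes (HMM forward) filter. This is a plain floating-point illustration, not the stochastic hardware implementation of the paper; the function name, the list-based representation and the convention transition[i][j] = P(next position j | current position i) are assumptions of the example.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One predict-update step of a discrete Bayes filter.

    belief: prior probabilities over the target positions.
    transition: transition[i][j] = P(position j at t | position i at t-1).
    likelihood: likelihood[j] = P(current observation | position j).
    """
    n = len(belief)
    # Predict: propagate the belief through the transition model.
    predicted = [sum(transition[i][j] * belief[i] for i in range(n)) for j in range(n)]
    # Update: weight by the observation likelihood and renormalise.
    unnormalised = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalised]
```

With belief [0.5, 0.5], transition [[0.9, 0.1], [0.1, 0.9]] and likelihood [0.8, 0.2], one step yields a posterior of approximately [0.8, 0.2]. Calling the function once per time step gives the online estimate of the target position.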
Approximation Of Multi-Valued Inverse Functions Using Clustering And Sugeno Fuzzy Inference

Science.gov (United States)

Walden, Maria A.; Bikdash, Marwan; Homaifar, Abdollah

1998-01-01

Finding the inverse of a continuous function can be challenging and computationally expensive when the inverse function is multi-valued. Difficulties may be compounded when the function itself is difficult to evaluate. We show that we can use fuzzy-logic approximators such as Sugeno inference systems to compute the inverse on-line. To do so, a fuzzy clustering algorithm can be used in conjunction with a discriminating function to split the function data into branches for the different values of the forward function. These data sets are then fed into a recursive least-squares learning algorithm that finds the proper coefficients of the Sugeno approximators; each Sugeno approximator finds one value of the inverse function. Discussions about the accuracy of the approximation will be included.
Inferring topologies of complex networks with hidden variables.

Science.gov (United States)

Wu, Xiaoqun; Wang, Weihan; Zheng, Wei Xing

2012-10-01

Network topology plays a crucial role in determining a network's intrinsic dynamics and function, thus understanding and modeling the topology of a complex network will lead to greater knowledge of its evolutionary mechanisms and to a better understanding of its behaviors. In the past few years, topology identification of complex networks has received increasing interest and wide attention. Many approaches have been developed for this purpose, including synchronization-based identification, information-theoretic methods, and intelligent optimization algorithms. However, inferring interaction patterns from observed dynamical time series is still challenging, especially in the absence of knowledge of nodal dynamics and in the presence of system noise. The purpose of this work is to present a simple and efficient approach to inferring the topologies of such complex networks. The proposed approach is called "piecewise partial Granger causality." It measures the cause-effect connections of nonlinear time series influenced by hidden variables. One commonly used testing network, two regular networks with a few additional links, and small-world networks are used to evaluate the performance and illustrate the influence of network parameters on the proposed approach. Application to experimental data further demonstrates the validity and robustness of our method.
Ab initio Algorithmic Causal Deconvolution of Intertwined Programs and Networks by Generative Mechanism

KAUST Repository

Zenil, Hector

2018-02-18

To extract and learn representations leading to generative mechanisms from data, especially without making arbitrary decisions and biased assumptions, is a central challenge in most areas of scientific research, particularly in connection to current major limitations of influential topics and methods of machine and deep learning, as they have often lost sight of the model component. Complex data is usually produced by interacting sources with different mechanisms. Here we introduce a parameter-free model-based approach, based upon the seminal concept of Algorithmic Probability, that decomposes an observation and signal into its most likely algorithmic generative mechanisms. Our methods use a causal calculus to infer model representations. We demonstrate the method's ability to distinguish interacting mechanisms and deconvolve them, regardless of whether the objects produce strings, space-time evolution diagrams, images or networks. We numerically test and evaluate our method and find that it can disentangle observations from discrete dynamic systems, random and complex networks. We think that these causal inference techniques can contribute as key pieces of information for estimations of probability distributions complementing other more statistical-oriented techniques that otherwise lack model inference capabilities.
Optimal inference with suboptimal models: Addiction and active Bayesian inference

Science.gov (United States)

Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

2015-01-01

When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
DESIGNING ALGORITHMS FOR SERVICE ROBOTS ON THE BASIS OF MIVAR APPROACH

Directory of Open Access Journals (Sweden)

Alexey Andreevich Panferov

2017-05-01

Opportunities of the mivar-based approach for robots have been analyzed. The mivar-based method of rapid logical inference for calculating random algorithms of service robot functioning has been tested successfully. A logical model of office robot-guide functioning, applying the mivar-based method of rapid logical inference in the software environment “KESMI” (Wi!Mi 1.1), has been developed. A formalized map of the office for the service robot has been described in a mivar matrix, 63 objects for 100 rules. Simulation of robot functioning in the software environment V-REP has been performed.
Inference of miRNA targets using evolutionary conservation and pathway analysis

Directory of Open Access Journals (Sweden)

van Nimwegen Erik

2007-03-01

Background: MicroRNAs have emerged as important regulatory genes in a variety of cellular processes and, in recent years, hundreds of such genes have been discovered in animals. In contrast, functional annotations are available only for a very small fraction of these miRNAs, and even in these cases only partially. Results: We developed a general Bayesian method for the inference of miRNA target sites, in which, for each miRNA, we explicitly model the evolution of orthologous target sites in a set of related species. Using this method we predict target sites for all known miRNAs in flies, worms, fish, and mammals. By comparing our predictions in fly with a reference set of experimentally tested miRNA-mRNA interactions we show that our general method performs at least as well as the most accurate methods available to date, including ones specifically tailored for target prediction in fly. An important novel feature of our model is that it explicitly infers the phylogenetic distribution of functional target sites, independently for each miRNA. This allows us to infer species-specific and clade-specific miRNA targeting. We also show that, in long human 3' UTRs, miRNA target sites occur preferentially near the start and near the end of the 3' UTR. To characterize miRNA function beyond the predicted lists of targets we further present a method to infer significant associations between the sets of targets predicted for individual miRNAs and specific biochemical pathways, in particular those of the KEGG pathway database. We show that this approach retrieves several known functional miRNA-mRNA associations, and predicts novel functions for known miRNAs in cell growth and in development. Conclusion: We have presented a Bayesian target prediction algorithm without any tunable parameters, that can be applied to sequences from any clade of species. The algorithm automatically infers the phylogenetic distribution of functional sites for each miRNA, and
Inference rule and problem solving

Energy Technology Data Exchange (ETDEWEB)

Goto, S

1982-04-01

Intelligent information processing aims to have human intellectual activity carried out on a computer, with inference, rather than ordinary calculation, as the basic operational mechanism. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the ability of current computers to perform inference is still low. To clarify the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.
Statistical inference approach to structural reconstruction of complex networks from binary time series

Science.gov (United States)

Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

2018-02-01

Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
A Bayesian inference approach to unveil supply curves in electricity markets

DEFF Research Database (Denmark)

Mitridati, Lesia Marie-Jeanne Mariane; Pinson, Pierre

2017-01-01

With increased competition in wholesale electricity markets, the need for new decision-making tools for strategic producers has arisen. Optimal bidding strategies have traditionally been modeled as stochastic profit maximization problems. However, for producers with non-negligible market power... in the literature on modeling this uncertainty. In this study we introduce a Bayesian inference approach to reveal the aggregate supply curve in a day-ahead electricity market. The proposed algorithm relies on Markov Chain Monte Carlo and Sequential Monte Carlo methods. The major appeal of this approach...

MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics

Directory of Open Access Journals (Sweden)

Milinkovitch Michel C

2010-07-01

Full Text Available Abstract Background The development, in the last decade, of stochastic heuristics implemented in robust application software has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of a phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Results Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA), together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameters setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs in 32 and 64-bits systems, and takes advantage of multiprocessor and multicore computers. Conclusions The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics into a single software will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these
Evolutionary rates at codon sites may be used to align sequences and infer protein domain function

Directory of Open Access Journals (Sweden)

Hazelhurst Scott

2010-03-01

Full Text Available Abstract Background Sequence alignments form part of many investigations in molecular biology, including the determination of phylogenetic relationships, the prediction of protein structure and function, and the measurement of evolutionary rates. However, to obtain meaningful results, a significant degree of sequence similarity is required to ensure that the alignments are accurate and the inferences correct. Limitations arise when sequence similarity is low, which is particularly problematic when working with fast-evolving genes, evolutionarily distant taxa, genomes with nucleotide biases, and cases of convergent evolution. Results A novel approach was conceptualized to address the "low sequence similarity" alignment problem. We developed an alignment algorithm termed FIRE (Functional Inference using the Rates of Evolution), which aligns sequences using the evolutionary rate at codon sites, as measured by the dN/dS ratio, rather than nucleotide or amino acid residues. FIRE was used to test the hypotheses that evolutionary rates can be used to align sequences and that the alignments may be used to infer protein domain function. Using a range of test data, we found that aligning domains based on evolutionary rates was possible even when sequence similarity was very low (for example, antibody variable regions). Furthermore, the alignment has the potential to infer protein domain function, indicating that domains with similar functions are subject to similar evolutionary constraints. These data suggest that an evolutionary rate-based approach to sequence analysis (particularly when combined with structural data) may be used to study cases of convergent evolution or when sequences have very low similarity. However, when aligning homologous gene sets with sequence similarity, FIRE did not perform as well as the best traditional alignment algorithms, indicating that the conventional approach of aligning residues as opposed to evolutionary rates remains the
Knowledge and inference

CERN Document Server

Nagao, Makoto

1990-01-01

Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig
Hybrid fuzzy charged system search algorithm based state estimation in distribution networks

Directory of Open Access Journals (Sweden)

Sachidananda Prasad

2017-06-01

Full Text Available This paper proposes a new hybrid charged system search (CSS) algorithm based state estimation in radial distribution networks in a fuzzy framework. The objective of the optimization problem is to minimize the weighted square of the difference between the measured and the estimated quantity. The proposed method of state estimation considers bus voltage magnitude and phase angle as state variables along with some equality and inequality constraints for state estimation in distribution networks. A rule based fuzzy inference system has been designed to control the parameters of the CSS algorithm to achieve better balance between the exploration and exploitation capability of the algorithm. The efficiency of the proposed fuzzy adaptive charged system search (FACSS) algorithm has been tested on standard IEEE 33-bus system and Indian 85-bus practical radial distribution system. The obtained results have been compared with the conventional CSS algorithm, weighted least square (WLS) algorithm and particle swarm optimization (PSO) for feasibility of the algorithm.
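The objective described in this abstract, a weighted sum of squared differences between measured and estimated quantities, can be sketched as follows. This is only an illustrative sketch: the function name `weighted_squared_error` and its arguments are assumptions, and the record does not give the authors' actual formulation, weights, or constraints.

```python
def weighted_squared_error(measured, estimated, weights):
    """Return sum_i w_i * (z_i - h_i)^2 for paired measurements.

    measured  : observed values z_i
    estimated : values h_i predicted from the current state estimate
    weights   : non-negative weights w_i (hypothetical; often 1 / variance)
    """
    if not (len(measured) == len(estimated) == len(weights)):
        raise ValueError("all inputs must have the same length")
    return sum(w * (z - h) ** 2 for z, h, w in zip(measured, estimated, weights))


# Two measurements, each off by 0.5, with weights 2 and 4.
cost = weighted_squared_error([1.0, 2.0], [0.5, 2.5], [2.0, 4.0])
```

A search heuristic such as CSS or PSO would evaluate a cost of this kind for each candidate state and keep the candidates with the lowest value.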
Geometric statistical inference

International Nuclear Information System (INIS)

Periwal, Vipul

1999-01-01

A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined
Model-free inference of direct network interactions from nonlinear collective dynamics.

Science.gov (United States)

Casadiego, Jose; Nitzan, Mor; Hallerberg, Sarah; Timme, Marc

2017-12-19

The topology of interactions in network dynamical systems fundamentally underlies their function. Accelerating technological progress creates massively available data about collective nonlinear dynamics in physical, biological, and technological systems. Detecting direct interaction patterns from those dynamics still constitutes a major open problem. In particular, current nonlinear dynamics approaches mostly require to know a priori a model of the (often high dimensional) system dynamics. Here we develop a model-independent framework for inferring direct interactions solely from recording the nonlinear collective dynamics generated. Introducing an explicit dependency matrix in combination with a block-orthogonal regression algorithm, the approach works reliably across many dynamical regimes, including transient dynamics toward steady states, periodic and non-periodic dynamics, and chaos. Together with its capabilities to reveal network (two point) as well as hypernetwork (e.g., three point) interactions, this framework may thus open up nonlinear dynamics options of inferring direct interaction patterns across systems where no model is known.
438 Adaptive Kernel in Meshsize Boosting Algorithm in KDE (Pp ...

African Journals Online (AJOL)

FIRST LADY

2011-01-18

Jan 18, 2011 ... Birke, Melanie (2009). “Shape constrained KDE.” Journal of Statistical. Planning & Inference, vol 139, issue 8 , August 2009, pg 2851 –. 2862. Duffy, N. and Helmbold, D. (2000). “Potential boosters? Advances in Neural info.” Proc. Sys. 12, 258 – 264. Freund, Y. (1995). “Boosting a Weak Learning Algorithm ...
Using adaptive network based fuzzy inference system to forecast regional electricity loads

International Nuclear Information System (INIS)

Ying, L.-C.; Pan, M.-C.

2008-01-01

Since accurate regional load forecasting is very important for improvement of the management performance of the electric industry, various regional load forecasting methods have been developed. The purpose of this study is to apply the adaptive network based fuzzy inference system (ANFIS) model to forecast the regional electricity loads in Taiwan and demonstrate the forecasting performance of this model. Based on the mean absolute percentage errors and statistical results, we can see that the ANFIS model has better forecasting performance than the regression model, artificial neural network (ANN) model, support vector machines with genetic algorithms (SVMG) model, recurrent support vector machines with genetic algorithms (RSVMG) model and hybrid ellipsoidal fuzzy systems for time series forecasting (HEFST) model. Thus, the ANFIS model is a promising alternative for forecasting regional electricity loads
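The comparison in this abstract rests on the mean absolute percentage error (MAPE). A minimal sketch of that metric is given below; the function name `mape` and the sample numbers are illustrative assumptions and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical loads: each forecast is off by 10% of the actual value.
error_percent = mape([100.0, 200.0], [110.0, 180.0])
```

A lower MAPE indicates a better forecast, which is the sense in which the abstract ranks ANFIS ahead of the other models.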
Using adaptive network based fuzzy inference system to forecast regional electricity loads

Energy Technology Data Exchange (ETDEWEB)

Ying, Li-Chih [Department of Marketing Management, Central Taiwan University of Science and Technology, 11, Pu-tzu Lane, Peitun, Taichung City 406 (China); Pan, Mei-Chiu [Graduate Institute of Management Sciences, Nanhua University, 32, Chung Keng Li, Dalin, Chiayi 622 (China)

2008-02-15

Since accurate regional load forecasting is very important for improvement of the management performance of the electric industry, various regional load forecasting methods have been developed. The purpose of this study is to apply the adaptive network based fuzzy inference system (ANFIS) model to forecast the regional electricity loads in Taiwan and demonstrate the forecasting performance of this model. Based on the mean absolute percentage errors and statistical results, we can see that the ANFIS model has better forecasting performance than the regression model, artificial neural network (ANN) model, support vector machines with genetic algorithms (SVMG) model, recurrent support vector machines with genetic algorithms (RSVMG) model and hybrid ellipsoidal fuzzy systems for time series forecasting (HEFST) model. Thus, the ANFIS model is a promising alternative for forecasting regional electricity loads. (author)
First order augmentation to tensor voting for boundary inference and multiscale analysis in 3D.

Science.gov (United States)

Tong, Wai-Shun; Tang, Chi-Keung; Mordohai, Philippos; Medioni, Gérard

2004-05-01

Most computer vision applications require the reliable detection of boundaries. In the presence of outliers, missing data, orientation discontinuities, and occlusion, this problem is particularly challenging. We propose to address it by complementing the tensor voting framework, which was limited to second order properties, with first order representation and voting. First order voting fields and a mechanism to vote for 3D surface and volume boundaries and curve endpoints in 3D are defined. Boundary inference is also useful for a second difficult problem in grouping, namely, automatic scale selection. We propose an algorithm that automatically infers the smallest scale that can preserve the finest details. Our algorithm then proceeds with progressively larger scales to ensure continuity where it has not been achieved. Therefore, the proposed approach does not oversmooth features or delay the handling of boundaries and discontinuities until model misfit occurs. The interaction of smooth features, boundaries, and outliers is accommodated by the unified representation, making possible the perceptual organization of data in curves, surfaces, volumes, and their boundaries simultaneously. We present results on a variety of data sets to show the efficacy of the improved formalism.
Bayesian inference and decision theory - A framework for decision making in natural resource management

Science.gov (United States)

Dorazio, R.M.; Johnson, F.A.

2003-01-01

Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
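The idea of evaluating the expected consequences of alternative actions from MCMC output can be sketched as below. This is a generic sketch, not the authors' waterfowl example: the names `best_action` and `utility`, the candidate actions, and the sample values are all hypothetical.

```python
def best_action(actions, posterior_samples, utility):
    """Pick the action with the highest posterior expected utility.

    posterior_samples : draws of the unknown parameter (e.g. from MCMC)
    utility           : function (action, parameter) -> float
    """
    if not posterior_samples:
        raise ValueError("need at least one posterior sample")

    def expected_utility(action):
        # Monte Carlo average of the utility over the posterior draws.
        total = sum(utility(action, theta) for theta in posterior_samples)
        return total / len(posterior_samples)

    return max(actions, key=expected_utility)


# Hypothetical setup: utility penalizes squared distance between the
# action and the unknown parameter.
samples = [0.2, 0.4, 0.6]
choice = best_action([0.0, 0.4, 1.0], samples, lambda a, t: -(a - t) ** 2)
```

Averaging over posterior draws, rather than plugging in a single point estimate, is what lets parameter uncertainty influence the chosen action.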
Extending Dylan's type system for better type inference and error detection

DEFF Research Database (Denmark)

Mehnert, Hannes

2010-01-01

a dynamically typed language. Dylan poses several special challenges for gradual typing, such as multiple return values, variable-arity methods and generic functions (multiple dispatch). In this paper Dylan is extended with function types and parametric polymorphism. We implemented the type system...... and a unification-based type inference algorithm in the mainstream Dylan compiler. As a case study we use the Dylan standard library (roughly 32000 lines of code), which witnesses that the implementation generates faster code with fewer errors. Some previously undiscovered errors in the Dylan library were revealed....
Goal inferences about robot behavior : goal inferences and human response behaviors

NARCIS (Netherlands)

Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.

2014-01-01

This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.
Network inference via adaptive optimal design

Directory of Open Access Journals (Sweden)

Stigter Johannes D

2012-09-01

Full Text Available Abstract Background Current research in network reverse engineering for genetic or metabolic networks very often does not include a proper experimental and/or input design. In this paper we address this issue in more detail and suggest a method that includes an iterative design of experiments based on the most recent data that become available. The presented approach allows a reliable reconstruction of the network and addresses an important issue, i.e., the analysis and the propagation of uncertainties as they exist in both the data and in our own knowledge. These two types of uncertainties have their immediate ramifications for the uncertainties in the parameter estimates and, hence, are taken into account from the very beginning of our experimental design. Findings The method is demonstrated for two small networks that include a genetic network for mRNA synthesis and degradation and an oscillatory network describing a molecular network underlying adenosine 3’-5’ cyclic monophosphate (cAMP) as observed in populations of Dictyostelium cells. In both cases a substantial reduction in parameter uncertainty was observed. Extension to larger scale networks is possible but needs a more rigorous parameter estimation algorithm that includes sparsity as a constraint in the optimization procedure. Conclusion We conclude that a careful experiment design very often (but not always) pays off in terms of reliability in the inferred network topology. For large scale networks a better parameter estimation algorithm is required that includes sparsity as an additional constraint. These algorithms are available in the literature and can also be used in an adaptive optimal design setting as demonstrated in this paper.
Predictive Distribution of the Dirichlet Mixture Model by the Local Variational Inference Method

DEFF Research Database (Denmark)

Ma, Zhanyu; Leijon, Arne; Tan, Zheng-Hua

2014-01-01

the predictive likelihood of the new upcoming data, especially when the amount of training data is small. The Bayesian estimation of a Dirichlet mixture model (DMM) is, in general, not analytically tractable. In our previous work, we have proposed a global variational inference-based method for approximately...... calculating the posterior distributions of the parameters in the DMM analytically. In this paper, we extend our previous study for the DMM and propose an algorithm to calculate the predictive distribution of the DMM with the local variational inference (LVI) method. The true predictive distribution of the DMM...... is analytically intractable. By considering the concave property of the multivariate inverse beta function, we introduce an upper-bound to the true predictive distribution. As the global minimum of this upper-bound exists, the problem is reduced to seek an approximation to the true predictive distribution...
A novel lipid nanoemulsion system for improved permeation of granisetron.

Science.gov (United States)

Doh, Hea-Jeong; Jung, Yunjin; Balakrishnan, Prabagar; Cho, Hyun-Jong; Kim, Dae-Duk

2013-01-01

A new lipid nanoemulsion (LNE) system containing granisetron (GRN) was developed and its in vitro permeation-enhancing effect was evaluated using Caco-2 cell monolayers. Particle size, polydispersity index (PI) and stability of the prepared GRN-loaded LNE systems were also characterized. The mean diameters of prepared LNEs were around 50 nm with PI<0.2. Developed LNEs were stable at 4°C in the dark place over a period of 12 weeks. In vitro drug dissolution and cytotoxicity studies of GRN-loaded LNEs were performed. GRN-loaded LNEs exhibited significantly higher drug dissolution than GRN suspension at pH 6.8 for 2h (P<0.05). In vitro permeation study in Caco-2 cell monolayers showed that the LNEs significantly enhanced the drug permeation compared to GRN powder. The in vivo toxicity study in the rat jejunum revealed that the prepared GRN-loaded LNE was as safe as the commercial formulation (Kytril). These results suggest that LNE could be used as a potential oral liquid formulation of GRN for anti-emetic treatment on the post-operative and chemotherapeutic patients. Copyright © 2012 Elsevier B.V. All rights reserved.
Extreme-Scale Bayesian Inference for Uncertainty Quantification of Complex Simulations

Energy Technology Data Exchange (ETDEWEB)

Biros, George [Univ. of Texas, Austin, TX (United States)

2018-01-12

Uncertainty quantification (UQ)—that is, quantifying uncertainties in complex mathematical models and their large-scale computational implementations—is widely viewed as one of the outstanding challenges facing the field of CS&E over the coming decade. The EUREKA project set to address the most difficult class of UQ problems: those for which both the underlying PDE model as well as the uncertain parameters are of extreme scale. In the project we worked on these extreme-scale challenges in the following four areas: 1. Scalable parallel algorithms for sampling and characterizing the posterior distribution that exploit the structure of the underlying PDEs and parameter-to-observable map. These include structure-exploiting versions of the randomized maximum likelihood method, which aims to overcome the intractability of employing conventional MCMC methods for solving extreme-scale Bayesian inversion problems by appealing to and adapting ideas from large-scale PDE-constrained optimization, which have been very successful at exploring high-dimensional spaces. 2. Scalable parallel algorithms for construction of prior and likelihood functions based on learning methods and non-parametric density estimation. Constructing problem-specific priors remains a critical challenge in Bayesian inference, and more so in high dimensions. Another challenge is construction of likelihood functions that capture unmodeled couplings between observations and parameters. We will create parallel algorithms for non-parametric density estimation using high dimensional N-body methods and combine them with supervised learning techniques for the construction of priors and likelihood functions. 3. Bayesian inadequacy models, which augment physics models with stochastic models that represent their imperfections. The success of the Bayesian inference framework depends on the ability to represent the uncertainty due to imperfections of the mathematical model of the phenomena of interest. This is a
Entropic Inference

OpenAIRE

Caticha, Ariel

2010-01-01

In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...
Partial Tmem106b reduction does not correct abnormalities due to progranulin haploinsufficiency.

Science.gov (United States)

Arrant, Andrew E; Nicholson, Alexandra M; Zhou, Xiaolai; Rademakers, Rosa; Roberson, Erik D

2018-06-22

Loss of function mutations in progranulin (GRN) are a major cause of frontotemporal dementia (FTD). Progranulin is a secreted glycoprotein that localizes to lysosomes and is critical for proper lysosomal function. Heterozygous GRN mutation carriers develop FTD with TDP-43 pathology and exhibit signs of lysosomal dysfunction in the brain, with increased levels of lysosomal proteins and lipofuscin accumulation. Homozygous GRN mutation carriers develop neuronal ceroid lipofuscinosis (NCL), an earlier-onset lysosomal storage disorder caused by severe lysosomal dysfunction. Multiple genome-wide association studies have shown that risk of FTD in GRN mutation carriers is modified by polymorphisms in TMEM106B, which encodes a lysosomal membrane protein. Risk alleles of TMEM106B may increase TMEM106B levels through a variety of mechanisms. Brains from FTD patients with GRN mutations exhibit increased TMEM106B expression, and protective TMEM106B polymorphisms are associated with decreased TMEM106B expression. Together, these data raise the possibility that reduction of TMEM106B levels may protect against the pathogenic effects of progranulin haploinsufficiency. We crossed Tmem106b +/- mice with Grn +/- mice, which model the progranulin haploinsufficiency of GRN mutation carriers and develop age-dependent social deficits and lysosomal abnormalities in the brain. We tested whether partial Tmem106b reduction could normalize the social deficits and lysosomal abnormalities of Grn +/- mice. Partial reduction of Tmem106b levels did not correct the social deficits of Grn +/- mice. Tmem106b reduction also failed to normalize most lysosomal abnormalities of Grn +/- mice, except for β-glucuronidase activity, which was suppressed by Tmem106b reduction and increased by progranulin insufficiency. These data do not support the hypothesis that Tmem106b reduction protects against the pathogenic effects of progranulin haploinsufficiency, but do show that Tmem106b reduction normalizes some
Progranulin Gene Therapy Improves Lysosomal Dysfunction and Microglial Pathology Associated with Frontotemporal Dementia and Neuronal Ceroid Lipofuscinosis.

Science.gov (United States)

Arrant, Andrew E; Onyilo, Vincent C; Unger, Daniel E; Roberson, Erik D

2018-02-28

Loss-of-function mutations in progranulin, a lysosomal glycoprotein, cause neurodegenerative disease. Progranulin haploinsufficiency causes frontotemporal dementia (FTD) and complete progranulin deficiency causes CLN11 neuronal ceroid lipofuscinosis (NCL). Progranulin replacement is a rational therapeutic strategy for these disorders, but there are critical unresolved mechanistic questions about a progranulin gene therapy approach, including its potential to reverse existing pathology. Here, we address these issues using an AAV vector (AAV-Grn) to deliver progranulin in Grn -/- mice (both male and female), which model aspects of NCL and FTD pathology, developing lysosomal dysfunction, lipofuscinosis, and microgliosis. We first tested whether AAV-Grn could improve preexisting pathology. Even with treatment after onset of pathology, AAV-Grn reduced lipofuscinosis in several brain regions of Grn -/- mice. AAV-Grn also reduced microgliosis in brain regions distant from the injection site. AAV-expressed progranulin was only detected in neurons, not in microglia, indicating that the microglial activation in progranulin deficiency can be improved by targeting neurons and thus may be driven at least in part by neuronal dysfunction. Even areas with sparse transduction and almost undetectable progranulin showed improvement, indicating that low-level replacement may be sufficiently effective. The beneficial effects of AAV-Grn did not require progranulin binding to sortilin. Finally, we tested whether AAV-Grn improved lysosomal function. AAV-derived progranulin was delivered to the lysosome, ameliorated the accumulation of LAMP-1 in Grn -/- mice, and corrected abnormal cathepsin D activity. These data shed light on progranulin biology and support progranulin-boosting therapies for NCL and FTD due to GRN mutations. SIGNIFICANCE STATEMENT Heterozygous loss-of-function progranulin (GRN) mutations cause frontotemporal dementia (FTD) and homozygous mutations cause neuronal

Learning Convex Inference of Marginals

OpenAIRE

Domke, Justin

2012-01-01

Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...
Fuzzy-Logic Based Distributed Energy-Efficient Clustering Algorithm for Wireless Sensor Networks.

Science.gov (United States)

Zhang, Ying; Wang, Jun; Han, Dezhi; Wu, Huafeng; Zhou, Rundong

2017-07-03

Owing to its high energy efficiency and scalability, the clustering routing algorithm has been widely used in wireless sensor networks (WSNs). To gather information more efficiently, each sensor node transmits data to the Cluster Head (CH) of the cluster to which it belongs by multi-hop communication. However, multi-hop communication within the cluster causes excessive energy consumption at the relay nodes closer to the CH. These nodes' energy is consumed more quickly than that of the farther nodes, which harms load balance across the whole network. Therefore, we propose an energy-efficient distributed clustering algorithm based on a fuzzy approach with non-uniform distribution (EEDCF). During CH election, we take nodes' energies, nodes' degrees and neighbor nodes' residual energies into consideration as the input parameters. In addition, we use the Takagi, Sugeno and Kang (TSK) fuzzy model instead of the traditional method as our inference system to make the quantitative analysis more reasonable. In our scheme, each sensor node calculates its probability of becoming a CH with the help of the fuzzy inference system in a distributed way. The experimental results indicate that the EEDCF algorithm outperforms some current representative methods in terms of data transmission, energy consumption and network lifetime.
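A zero-order TSK inference step over the three inputs named in this abstract might look like the sketch below. Everything specific here is an assumption made for illustration: the inputs are taken to be normalized to [0, 1], and the membership functions, the eight rules, and their constant consequents are hypothetical and are not the EEDCF rule base.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical fuzzy sets on a normalized [0, 1] scale.
MEMBERSHIP = {
    "low": lambda x: tri(x, -1.0, 0.0, 1.0),
    "high": lambda x: tri(x, 0.0, 1.0, 2.0),
}

# Hypothetical rules: (energy, degree, neighbor energy) -> constant output.
RULES = {
    ("low", "low", "low"): 0.0,
    ("low", "low", "high"): 0.3,
    ("low", "high", "low"): 0.2,
    ("low", "high", "high"): 0.5,
    ("high", "low", "low"): 0.5,
    ("high", "low", "high"): 0.8,
    ("high", "high", "low"): 0.7,
    ("high", "high", "high"): 1.0,
}


def ch_probability(energy, degree, neighbor_energy):
    """Weighted average of rule outputs, weighted by firing strength."""
    inputs = (energy, degree, neighbor_energy)
    num = den = 0.0
    for labels, consequent in RULES.items():
        strength = 1.0
        for x, label in zip(inputs, labels):
            strength *= MEMBERSHIP[label](x)  # product t-norm
        num += strength * consequent
        den += strength
    return num / den if den > 0.0 else 0.0
```

Unlike a Mamdani system, a TSK system needs no separate defuzzification stage: the output is directly the firing-strength-weighted average of the rule consequents, which keeps the per-node computation cheap.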
PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface.

Science.gov (United States)

Uszkoreit, Julian; Maerkens, Alexandra; Perez-Riverol, Yasset; Meyer, Helmut E; Marcus, Katrin; Stephan, Christian; Kohlbacher, Oliver; Eisenacher, Martin

2015-07-02

Protein inference connects the peptide spectrum matches (PSMs) obtained from database search engines back to proteins, which are typically at the heart of most proteomics studies. Different search engines yield different PSMs and thus different protein lists. Analysis of results from one or multiple search engines is often hampered by different data exchange formats and lack of convenient and intuitive user interfaces. We present PIA, a flexible software suite for combining PSMs from different search engine runs and turning these into consistent results. PIA can be integrated into proteomics data analysis workflows in several ways. A user-friendly graphical user interface can be run either locally or (e.g., for larger core facilities) from a central server. For automated data processing, stand-alone tools are available. PIA implements several established protein inference algorithms and can combine results from different search engines seamlessly. On several benchmark data sets, we show that PIA can identify a larger number of proteins at the same protein FDR when compared to that using inference based on a single search engine. PIA supports the majority of established search engines and data in the mzIdentML standard format. It is implemented in Java and freely available at https://github.com/mpc-bioinformatics/pia.
Super learning to hedge against incorrect inference from arbitrary parametric assumptions in marginal structural modeling.

Science.gov (United States)

Neugebauer, Romain; Fireman, Bruce; Roy, Jason A; Raebel, Marsha A; Nichols, Gregory A; O'Connor, Patrick J

2013-08-01

Clinical trials are unlikely to ever be launched for many comparative effectiveness research (CER) questions. Inferences from hypothetical randomized trials may however be emulated with marginal structural modeling (MSM) using observational data, but success in adjusting for time-dependent confounding and selection bias typically relies on parametric modeling assumptions. If these assumptions are violated, inferences from MSM may be inaccurate. In this article, we motivate the application of a data-adaptive estimation approach called super learning (SL) to avoid reliance on arbitrary parametric assumptions in CER. Using the electronic health records data from adults with new-onset type 2 diabetes, we implemented MSM with inverse probability weighting (IPW) estimation to evaluate the effect of three oral antidiabetic therapies on the worsening of glomerular filtration rate. Inferences from IPW estimation were noticeably sensitive to the parametric assumptions about the associations between both the exposure and censoring processes and the main suspected source of confounding, that is, time-dependent measurements of hemoglobin A1c. SL was successfully implemented to harness flexible confounding and selection bias adjustment from existing machine learning algorithms. Erroneous IPW inference about clinical effectiveness because of arbitrary and incorrect modeling decisions may be avoided with SL. Copyright © 2013 Elsevier Inc. All rights reserved.
A new fast method for inferring multiple consensus trees using k-medoids.

Science.gov (United States)

Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

2018-04-05

Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while
Inferring animal social networks and leadership: applications for passive monitoring arrays.

Science.gov (United States)

Jacoby, David M P; Papastamatiou, Yannis P; Freeman, Robin

2016-11-01

Analyses of animal social networks have frequently benefited from techniques derived from other disciplines. Recently, machine learning algorithms have been adopted to infer social associations from time-series data gathered using remote, telemetry systems situated at provisioning sites. We adapt and modify existing inference methods to reveal the underlying social structure of wide-ranging marine predators moving through spatial arrays of passive acoustic receivers. From six months of tracking data for grey reef sharks (Carcharhinus amblyrhynchos) at Palmyra atoll in the Pacific Ocean, we demonstrate that some individuals emerge as leaders within the population and that this behavioural coordination is predicted by both sex and the duration of co-occurrences between conspecifics. In doing so, we provide the first evidence of long-term, spatially extensive social processes in wild sharks. To achieve these results, we interrogate simulated and real tracking data with the explicit purpose of drawing attention to the key considerations in the use and interpretation of inference methods and their impact on resultant social structure. We provide a modified translation of the GMMEvents method for R, including new analyses quantifying the directionality and duration of social events with the aim of encouraging the careful use of these methods more widely in less tractable social animal systems but where passive telemetry is already widespread. © 2016 The Authors.
Inference of Transcription Regulatory Network in Low Phytic Acid Soybean Seeds

Directory of Open Access Journals (Sweden)

Neelam Redekar

2017-11-01

A dominant loss of function mutation in the myo-inositol phosphate synthase (MIPS) gene and recessive loss of function mutations in two multidrug resistant protein type-ABC transporter genes not only reduce the seed phytic acid levels in soybean, but also affect the pathways associated with seed development, ultimately resulting in low emergence. To understand the regulatory mechanisms and identify key genes that intervene in the seed development process in low phytic acid crops, we performed computational inference of gene regulatory networks in low and normal phytic acid soybeans using time-course transcriptomic data and multiple network inference algorithms. We identified a set of putative candidate transcription factors and their regulatory interactions with genes that have functions in myo-inositol biosynthesis, auxin-ABA signaling, and seed dormancy. We evaluated the performance of our unsupervised network inference method by comparing the predicted regulatory network with published regulatory interactions in Arabidopsis. Some contrasting regulatory interactions were observed in low phytic acid mutants compared to non-mutant lines. These findings provide important hypotheses on expression regulation of myo-inositol metabolism and phytohormone signaling in developing low phytic acid soybeans. The computational pipeline used for unsupervised network learning in this study is provided as open source software and is freely available at https://lilabatvt.github.io/LPANetwork/.
An adaptive map-matching algorithm based on hierarchical fuzzy system from vehicular GPS data.

Directory of Open Access Journals (Sweden)

Jinjun Tang

An improved hierarchical fuzzy inference method based on a C-measure map-matching algorithm is proposed in this paper, in which the C-measure represents the certainty or probability of the vehicle traveling on the actual road. A strategy is first introduced that uses historical positioning information to perform curve-curve matching between vehicle trajectories and the shapes of candidate roads. It improves matching performance by overcoming the disadvantage of traditional map-matching algorithms, which consider only current information. An average historical distance is used to measure the similarity between vehicle trajectories and road shape. The input of the system includes three variables: the distance between the position point and candidate roads, the angle between the driving heading and the road direction, and the average distance. As the number of fuzzy rules increases exponentially when average distance is added as a variable, a hierarchical fuzzy inference system is then applied to reduce the number of fuzzy rules and improve calculation efficiency. Additionally, a learning process is updated to support the algorithm. Finally, a case study containing four different routes in Beijing is used to validate the effectiveness and superiority of the proposed method.
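The rule-explosion argument in this abstract can be made concrete with a small count. The sketch below assumes a complete rule base with the same number of fuzzy sets per variable, and assumes the hierarchy is a chain of two-input subsystems whose intermediate outputs are fuzzified with that same number of sets; the layout actually used in the paper is not stated in the abstract, and both function names are illustrative.

```python
def flat_rule_count(n_inputs, n_sets):
    """Rules in a single fuzzy system with a complete rule base:
    one rule per combination of fuzzy sets across all inputs."""
    return n_sets ** n_inputs

def hierarchical_rule_count(n_inputs, n_sets):
    """Rules in a chain of two-input subsystems (assumed layout):
    the first stage takes two inputs, and each later stage takes the
    previous stage's output plus one new input."""
    return (n_inputs - 1) * n_sets ** 2

# Three inputs, as in the abstract: distance, heading angle, average distance.
for n_sets in (3, 5, 7):
    print(n_sets, flat_rule_count(3, n_sets), hierarchical_rule_count(3, n_sets))
```

Under these assumptions the flat system grows as a power of the number of inputs, while the chained system grows only linearly in it, which is the motivation the abstract gives for the hierarchical design.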
Probabilistic inductive inference: a survey

OpenAIRE

Ambainis, Andris

2001-01-01

Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.
LAIT: a local ancestry inference toolkit.

Science.gov (United States)

Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei

2017-09-06

Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specific formatted input files and generate output files in various types, yielding practical inconvenience. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT provides convenience to run multiple local ancestry inference software. In addition, we evaluated the performance of local ancestry software among different supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.
Bayesian statistical inference

Directory of Open Access Journals (Sweden)

Bruno De Finetti

2017-04-01

This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essence of De Finetti's approach to statistical inference.
Dissociation of frontotemporal dementia-related deficits and neuroinflammation in progranulin haploinsufficient mice.

Science.gov (United States)

Filiano, Anthony J; Martens, Lauren Herl; Young, Allen H; Warmus, Brian A; Zhou, Ping; Diaz-Ramirez, Grisell; Jiao, Jian; Zhang, Zhijun; Huang, Eric J; Gao, Fen-Biao; Farese, Robert V; Roberson, Erik D

2013-03-20

Frontotemporal dementia (FTD) is a neurodegenerative disease with hallmark deficits in social and emotional function. Heterozygous loss-of-function mutations in GRN, the progranulin gene, are a common genetic cause of the disorder, but the mechanisms by which progranulin haploinsufficiency causes neuronal dysfunction in FTD are unclear. Homozygous progranulin knock-out (Grn(-/-)) mice have been studied as a model of this disorder and show behavioral deficits and a neuroinflammatory phenotype with robust microglial activation. However, homozygous GRN mutations causing complete progranulin deficiency were recently shown to cause a different neurological disorder, neuronal ceroid lipofuscinosis, suggesting that the total absence of progranulin may have effects distinct from those of haploinsufficiency. Here, we studied progranulin heterozygous (Grn(+/-)) mice, which model progranulin haploinsufficiency. We found that Grn(+/-) mice developed age-dependent social and emotional deficits potentially relevant to FTD. However, unlike Grn(-/-) mice, behavioral deficits in Grn(+/-) mice occurred in the absence of gliosis or increased expression of tumor necrosis factor-α. Instead, we found neuronal abnormalities in the amygdala, an area of selective vulnerability in FTD, in Grn(+/-) mice. Our findings indicate that FTD-related deficits resulting from progranulin haploinsufficiency can develop in the absence of detectable gliosis and neuroinflammation, thereby dissociating microglial activation from functional deficits and suggesting an important effect of progranulin deficiency on neurons.
Griffin: A Tool for Symbolic Inference of Synchronous Boolean Molecular Networks

Directory of Open Access Journals (Sweden)

Stalin Muñoz

2018-03-01

Boolean networks are important models of biochemical systems, located at the high end of the abstraction spectrum. A number of Boolean gene networks have been inferred following essentially the same method. Such a method first considers experimental data for a typically underdetermined “regulation” graph. Next, Boolean networks are inferred by using biological constraints to narrow the search space, such as a desired set of (fixed-point or cyclic) attractors. We describe Griffin, a computer tool enhancing this method. Griffin incorporates a number of well-established algorithms, such as Dubrova and Teslenko's algorithm for finding attractors in synchronous Boolean networks. In addition, a formal definition of regulation allows Griffin to employ “symbolic” techniques, able to represent both large sets of network states and Boolean constraints. We observe that when the set of attractors is required to be an exact set, prohibiting additional attractors, a naive Boolean coding of this constraint may be infeasible. Such cases may be intractable even with symbolic methods, as the number of Boolean constraints may be astronomically large. To overcome this problem, we employ an Artificial Intelligence technique known as “clause learning”, considerably increasing Griffin's scalability. Without clause learning, only toy examples prohibiting additional attractors are solvable: only one out of seven queries reported here is answered. With clause learning, by contrast, all seven queries are answered. We illustrate Griffin with three case studies drawn from the Arabidopsis thaliana literature. Griffin is available at: http://turing.iimas.unam.mx/griffin.
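To make the notion of fixed-point and cyclic attractors in a synchronous Boolean network concrete, here is a minimal sketch that finds them by exhaustive enumeration of the state space. It is neither Dubrova and Teslenko's algorithm nor Griffin's symbolic method, and it is only practical for a handful of genes; the three-gene update rule in `step` is a made-up example.

```python
from itertools import product

def step(state):
    """Synchronous update of a hypothetical 3-gene network:
    every gene reads the current state and all genes switch together."""
    a, b, c = state
    return (b, a, a & b)

def attractors(update, n_genes):
    """Return the set of attractors, each as a frozenset of states."""
    found = set()
    for start in product((0, 1), repeat=n_genes):
        seen = {}            # state -> position along the trajectory
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = update(state)
        # 'state' is the first revisited state; everything from its first
        # visit onwards lies on the cycle (length 1 means a fixed point).
        first = seen[state]
        found.add(frozenset(s for s, i in seen.items() if i >= first))
    return found

for attractor in sorted(attractors(step, 3), key=len):
    print(sorted(attractor))
```

For this toy rule the search yields two fixed points and one cycle of length two. The loop visits all 2^n states, which is why larger networks call for SAT-based or symbolic techniques of the kind the abstract mentions.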
A Comparative Study between SVM and Fuzzy Inference System for the Automatic Prediction of Sleep Stages and the Assessment of Sleep Quality

Directory of Open Access Journals (Sweden)

John Gialelis

2015-11-01

This paper compares two supervised learning algorithms for predicting sleep stages based on human brain activity. The first step of the presented work is feature extraction from real human electroencephalography (EEG) data together with the corresponding sleep stages, which are used to train a support vector machine (SVM) and a fuzzy inference system (FIS) algorithm. Then, the trained algorithms are used to predict the sleep stages of real human patients. Extensive comparison results are presented, which indicate that both classifiers could be used as a basis for unobtrusive sleep quality assessment.
Is there a hierarchy of social inferences? The likelihood and speed of inferring intentionality, mind, and personality.

Science.gov (United States)

Malle, Bertram F; Holbrook, Jess

2012-04-01

People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences (from intentionality and desire to belief to personality) that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.
Advances in multi-sensor data fusion: algorithms and applications.

Science.gov (United States)

Dong, Jiang; Zhuang, Dafang; Huang, Yaohuan; Fu, Jingying

2009-01-01

With the development of satellite and remote sensing techniques, more and more image data from airborne/satellite sensors have become available. Multi-sensor image fusion seeks to combine information from different images to obtain more inferences than can be derived from a single sensor. In image-based application fields, image fusion has emerged as a promising research area since the end of the last century. The paper presents an overview of recent advances in multi-sensor satellite image fusion. Firstly, the most popular existing fusion algorithms are introduced, with emphasis on their recent improvements. Advances in main applications fields in remote sensing, including object identification, classification, change detection and maneuvering targets tracking, are described. Both advantages and limitations of those applications are then discussed. Recommendations are addressed, including: (1) Improvements of fusion algorithms; (2) Development of "algorithm fusion" methods; (3) Establishment of an automatic quality assessment scheme.
INFERENCE BUILDING BLOCKS

Science.gov (United States)

2018-02-15

expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation
Practical Bayesian Inference

Science.gov (United States)

Bailer-Jones, Coryn A. L.

2017-04-01

Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
A sub-cubic time algorithm for computing the quartet distance between two general trees

DEFF Research Database (Denmark)

Nielsen, Jesper; Kristensen, Anders Kabell; Mailund, Thomas

2011-01-01

Background When inferring phylogenetic trees different algorithms may give different trees. To study such effects a measure for the distance between two trees is useful. Quartet distance is one such measure, and is the number of quartet topologies that differ between two trees. Results We have...... derived a new algorithm for computing the quartet distance between a pair of general trees, i.e. trees where inner nodes can have any degree ≥ 3. The time and space complexity of our algorithm is sub-cubic in the number of leaves and does not depend on the degree of the inner nodes. This makes...... it the fastest algorithm so far for computing the quartet distance between general trees independent of the degree of the inner nodes. Conclusions We have implemented our algorithm and two of the best competitors. Our new algorithm is significantly faster than the competition and seems to run in close...
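The quartet distance defined in this abstract can be computed naively by checking every set of four leaves. The sketch below does that in roughly O(n^4) time and is therefore not the sub-cubic algorithm the record describes. It classifies each quartet from unit-length path distances using the four-point condition, so unresolved (star) quartets in general trees are handled. The helper names and the two example trees are assumptions for illustration.

```python
from collections import deque
from itertools import combinations

def tree_from_edges(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def leaf_distances(adj, leaves):
    """Edge-count distance from every leaf to every node, via BFS."""
    dist = {}
    for src in leaves:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist

def quartet_topology(dist, a, b, c, d):
    """0 for ab|cd, 1 for ac|bd, 2 for ad|bc, None for a star quartet."""
    sums = (dist[a][b] + dist[c][d],
            dist[a][c] + dist[b][d],
            dist[a][d] + dist[b][c])
    smallest = min(sums)
    if sums.count(smallest) > 1:   # all three sums tie: unresolved
        return None
    return sums.index(smallest)

def quartet_distance(adj1, adj2, leaves):
    """Number of four-leaf subsets whose topology differs between trees."""
    leaves = sorted(leaves)
    d1 = leaf_distances(adj1, leaves)
    d2 = leaf_distances(adj2, leaves)
    return sum(quartet_topology(d1, *q) != quartet_topology(d2, *q)
               for q in combinations(leaves, 4))

# Two example trees on leaves a..e; the second swaps leaves b and c.
t1 = tree_from_edges([("x", "a"), ("x", "b"), ("x", "y"), ("y", "c"),
                      ("y", "z"), ("z", "d"), ("z", "e")])
t2 = tree_from_edges([("x", "a"), ("x", "c"), ("x", "y"), ("y", "b"),
                      ("y", "z"), ("z", "d"), ("z", "e")])
print(quartet_distance(t1, t2, "abcde"))
```

Of the five quartets on these leaves, only the two that contain a, b and c together change topology when b and c are swapped.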
Quantitative assessment of protein activity in orphan tissues and single cells using the metaVIPER algorithm. | Office of Cancer Genomics

Science.gov (United States)

We and others have shown that transition and maintenance of biological states is controlled by master regulator proteins, which can be inferred by interrogating tissue-specific regulatory models (interactomes) with transcriptional signatures, using the VIPER algorithm. Yet, some tissues may lack molecular profiles necessary for interactome inference (orphan tissues), or, as for single cells isolated from heterogeneous samples, their tissue context may be undetermined.

Reduced miR-659-3p levels correlate with progranulin increase in hypoxic conditions: implications for frontotemporal dementia.

Directory of Open Access Journals (Sweden)

Paola Piscopo

2016-05-01

Progranulin (PGRN) is a secreted protein expressed ubiquitously throughout the body, including the brain, where it localizes in neurons and activated microglia. Loss-of-function mutations in the GRN gene are an important cause of familial Frontotemporal Lobar Degeneration (FTLD). PGRN has neurotrophic and anti-inflammatory activity, and it is neuroprotective in several injury conditions, such as oxygen or glucose deprivation, oxidative injury, and hypoxic stress. Indeed, we have previously demonstrated that hypoxia induces the up-regulation of GRN transcripts. Several studies have shown the involvement of microRNAs in hypoxia. Moreover, in FTLD patients with a genetic variant of GRN (rs5848), the reinforcement of the miR-659-3p binding site has been suggested to be a risk factor. Here, we report that miR-659-3p interacts directly with the GRN 3'UTR, as shown by luciferase assay in HeLa cells and by ELISA and Western Blot analysis in HeLa and Kelly cells. Moreover, we demonstrate the physical binding between GRN mRNA and miR-659-3p employing a miRNA capture-affinity technology in SK-N-BE and Kelly cells. In order to study the involvement of miRNAs in hypoxia-mediated up-regulation of GRN, we evaluated miR-659-3p levels in SK-N-BE cells after 24 h of hypoxic treatment, finding them inversely correlated to GRN transcripts. Furthermore, we analyzed an animal model of asphyxia, finding that GRN mRNA levels increased at post-natal day (pnd) 1 and pnd 4 in rat cortices subjected to asphyxia in comparison to control rats, and that miR-659-3p decreased at pnd 4, just when GRN reached the highest levels. Our results demonstrate the interaction between miR-659-3p and the GRN transcript and the involvement of miR-659-3p in GRN up-regulation mediated by hypoxic/ischemic insults.
Graphics Processing Unit–Enhanced Genetic Algorithms for Solving the Temporal Dynamics of Gene Regulatory Networks

Science.gov (United States)

García-Calvo, Raúl; Guisado, JL; Diaz-del-Rio, Fernando; Córdoba, Antonio; Jiménez-Morales, Francisco

2018-01-01

Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge amount of rule combinations and the nonlinear inherent nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that needs long runtimes in conventional architectures for realistic network sizes, it is fundamental to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes—master-slave, island, cellular, and hybrid models, and various individual selection methods (roulette, elitist)—is carried out for this problem. Several procedures that optimize the use of the GPU’s resources are presented. We conclude that the implementation that produces better results (both from the performance and the genetic algorithm fitness perspectives) is simulating a few thousands of individuals grouped in a few islands using elitist selection. This model comprises 2 mighty factors for discovering the best solutions: finding good individuals in a short number of generations, and introducing genetic diversity via a relatively frequent and numerous migration. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out. In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on medium class GPU over an equivalent
Association of TMEM106B gene polymorphism with age at onset in granulin mutation carriers and plasma granulin protein levels.

Science.gov (United States)

Cruchaga, Carlos; Graff, Caroline; Chiang, Huei-Hsin; Wang, Jun; Hinrichs, Anthony L; Spiegel, Noah; Bertelsen, Sarah; Mayo, Kevin; Norton, Joanne B; Morris, John C; Goate, Alison

2011-05-01

To test whether rs1990622 (TMEM106B) is associated with age at onset (AAO) in granulin (GRN) mutation carriers and with plasma GRN levels in mutation carriers and healthy, elderly individuals. Rs1990622 (TMEM106B) was identified as a risk factor for frontotemporal lobar degeneration with TAR DNA-binding protein inclusions (FTLD-TDP) in a recent genome-wide association. Rs1990622 was genotyped in GRN mutation carriers and tested for association with AAO using the Kaplan-Meier method and a Cox proportional hazards model. Alzheimer's Disease Research Center. Subjects: We analyzed 50 affected and unaffected GRN mutation carriers from 4 previously reported FTLD-TDP families (HDDD1, FD1, HDDD2, and the Karolinska family). The GRN plasma levels were also measured in 73 healthy, elderly individuals. Age at onset and GRN plasma levels. The risk allele of rs1990622 was associated with a mean decrease of the AAO of 13 years (P = 9.9 × 10^-7) and with lower plasma GRN levels in both healthy older adults (P = 4 × 10^-4) and GRN mutation carriers (P = .0027). Analysis of the HapMap database identified a nonsynonymous single-nucleotide polymorphism rs3173615 (T185S) in perfect linkage disequilibrium with rs1990622. The association of rs1990622 with AAO explains, in part, the wide range in the AAO of disease among GRN mutation carriers. We hypothesize that rs1990622 or another variant in linkage disequilibrium could act in a manner similar to APOE in Alzheimer disease, increasing risk for disease in the general population and modifying AAO in mutation carriers. Our results also suggest that genetic variation in TMEM106B may influence risk for FTLD-TDP by modulating secreted levels of GRN.
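The Kaplan-Meier method named in this abstract estimates the fraction of carriers still unaffected at each age while accounting for carriers who had not developed disease when last observed (censored). The following is a minimal sketch of the product-limit estimator; the function name and the five ages below are invented for illustration and are not data from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- age at onset, or age at last follow-up if censored
    events -- 1 if onset was observed at that age, 0 if censored
    Returns a list of (time, survival) pairs at each observed onset age.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        onsets = 0
        leaving = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            onsets += data[i][1]
            leaving += 1
            i += 1
        if onsets:
            survival *= 1.0 - onsets / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # both onsets and censored subjects leave
    return curve

# Toy illustration only: one subject is censored at age 55.
ages = [50, 55, 55, 60, 62]
observed = [1, 1, 0, 1, 1]
for age, s in kaplan_meier(ages, observed):
    print(age, round(s, 3))
```

Comparing such curves between genotype groups, and then fitting a Cox proportional hazards model, is the usual route to an association test of the kind reported; that second step is not sketched here.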
Logical inference and evaluation

International Nuclear Information System (INIS)

Perey, F.G.

1981-01-01

Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given
AUTOMATIC ROAD GAP DETECTION USING FUZZY INFERENCE SYSTEM

Directory of Open Access Journals (Sweden)

S. Hashemi

2012-09-01

Automatic feature extraction from aerial and satellite images is a high-level data processing task that remains one of the most important research topics in the field. Most research in this area focuses on the early step of road detection, where road tracking methods, morphological analysis, dynamic programming and snakes, multi-scale and multi-resolution methods, stereoscopic and multi-temporal analysis, and hyperspectral experiments are some of the mature methods. Although most research focuses on detection algorithms, none of them can extract a road network perfectly. On the other hand, post-processing algorithms aimed at refining road detection results are not as well developed. In this article, the main aim is to design an intelligent method to detect and compensate for road gaps remaining in the early results of road detection algorithms. The proposed algorithm consists of five main steps as follows: 1) Short gap coverage: a multi-scale morphological approach is designed that covers short gaps in a hierarchical scheme. 2) Long gap detection: the long gaps that could not be covered in the previous stage are detected using a fuzzy inference system. For this purpose, a knowledge base consisting of expert rules is designed, and the rules are fired on gap candidates in the road detection results. 3) Long gap coverage: detected long gaps are compensated by two strategies, linear and polynomial; shorter gaps are filled by line fitting, while longer ones are compensated by polynomials. 4) Accuracy assessment: to evaluate the obtained results, some accuracy assessment criteria are proposed. These criteria are obtained by comparing the results with correctly compensated ones produced by a human expert. The complete evaluation of the results, with technical discussion, is the subject of the full paper.
Automatic Road Gap Detection Using Fuzzy Inference System

Science.gov (United States)

Hashemi, S.; Valadan Zoej, M. J.; Mokhtarzadeh, M.

2011-09-01

Automatic feature extraction from aerial and satellite images is a high-level data processing task that remains one of the most important research topics in the field. Most research in this area focuses on the early step of road detection, where road tracking methods, morphological analysis, dynamic programming and snakes, multi-scale and multi-resolution methods, stereoscopic and multi-temporal analysis, and hyperspectral experiments are some of the mature methods. Although most research focuses on detection algorithms, none of them can extract a road network perfectly. On the other hand, post-processing algorithms that refine road detection results are not as well developed. In this article, the main aim is to design an intelligent method to detect and compensate for the road gaps that remain in the early results of road detection algorithms. The proposed algorithm consists of five main steps as follows: 1) Short gap coverage: a multi-scale morphological method is designed that covers short gaps in a hierarchical scheme. 2) Long gap detection: the long gaps that could not be covered in the previous stage are detected using a fuzzy inference system. For this purpose, a knowledge base consisting of expert rules is designed, and the rules are fired on gap candidates in the road detection results. 3) Long gap coverage: detected long gaps are compensated by two strategies, linear and polynomial: shorter gaps are filled by line fitting, while longer ones are compensated by polynomials. 4) Accuracy assessment: to evaluate the obtained results, some accuracy assessment criteria are proposed. These criteria are obtained by comparing the results with correctly compensated ones produced by a human expert. The complete evaluation of the obtained results, with their technical discussion, is the material of the full paper.
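The long gap coverage step described above (line fitting for shorter gaps, polynomials for longer ones) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `fill_gap`, the length threshold, the polynomial degree, and the simplification that the road can be written as y(x) near the gap are all hypothetical.

```python
import numpy as np

def fill_gap(left_pts, right_pts, long_gap_threshold=30.0, poly_degree=2, n_fill=10):
    """Bridge the gap between two road segments (illustrative sketch).

    left_pts, right_pts: (n, 2) arrays of (x, y) centreline points on each
    side of the gap, ordered along the road. Assumes the road is a function
    y(x) near the gap (no vertical stretches).
    """
    left_pts = np.asarray(left_pts, dtype=float)
    right_pts = np.asarray(right_pts, dtype=float)

    # Gap length: distance between the last left point and the first right point.
    gap_length = np.linalg.norm(right_pts[0] - left_pts[-1])

    # Shorter gaps get a straight line, longer ones a polynomial (assumed rule).
    degree = 1 if gap_length <= long_gap_threshold else poly_degree

    # Fit y(x) to the known points on both sides of the gap.
    pts = np.vstack([left_pts, right_pts])
    coeffs = np.polyfit(pts[:, 0], pts[:, 1], degree)

    # Sample the fitted curve strictly inside the gap.
    xs = np.linspace(left_pts[-1, 0], right_pts[0, 0], n_fill + 2)[1:-1]
    ys = np.polyval(coeffs, xs)
    return degree, np.column_stack([xs, ys])
```

In the abstract, the decision about which gaps to fill comes from the fuzzy inference system; the fixed threshold here merely stands in for that decision.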
EM algorithm for one-shot device testing with competing risks under exponential distribution

International Nuclear Information System (INIS)

Balakrishnan, N.; So, H.Y.; Ling, M.H.

2015-01-01

This paper provides an extension of the work of Balakrishnan and Ling [1] by introducing a competing risks model into a one-shot device testing analysis under an accelerated life test setting. An Expectation Maximization (EM) algorithm is then developed for the estimation of the model parameters. An extensive Monte Carlo simulation study is carried out to assess the performance of the EM algorithm and then compare the obtained results with the initial estimates obtained by the Inequality Constrained Least Squares (ICLS) method of estimation. Finally, we apply the EM algorithm to a clinical data set, ED01, to illustrate the method of inference developed here. - Highlights: • ALT data analysis for one-shot devices with competing risks is considered. • EM algorithm is developed for the determination of the MLEs. • The estimations of lifetime under normal operating conditions are presented. • The EM algorithm improves the convergence rate.
R Package multiPIM: A Causal Inference Approach to Variable Importance Analysis

Directory of Open Access Journals (Sweden)

Stephan J Ritter

2014-04-01

We describe the R package multiPIM, including statistical background, functionality and user options. The package is for variable importance analysis, and is meant primarily for analyzing data from exploratory epidemiological studies, though it could certainly be applied in other areas as well. The approach taken to variable importance comes from the causal inference field, and is different from approaches taken in other R packages. By default, multiPIM uses a double robust targeted maximum likelihood estimator (TMLE) of a parameter akin to the attributable risk. Several regression methods/machine learning algorithms are available for estimating the nuisance parameters of the models, including super learner, a meta-learner which combines several different algorithms into one. We describe a simulation in which the double robust TMLE is compared to the graphical computation estimator. We also provide example analyses using two data sets which are included with the package.
ENHANCED PREDICTION OF STUDENT DROPOUTS USING FUZZY INFERENCE SYSTEM AND LOGISTIC REGRESSION

Directory of Open Access Journals (Sweden)

A. Saranya

2016-01-01

Predicting college and school dropouts is a major problem in the educational system and a complicated challenge because of data imbalance and multi-dimensionality, which can affect the identification of low-performing students. In this paper, we collected databases from various colleges; among these, the 500 best real attributes are identified in order to find the factors affecting student dropout using a neural-based classification algorithm, and different mining techniques are implemented for data processing. We also propose a Dropout Prediction Algorithm (DPA) using a fuzzy logic and logistic regression based inference system, because the weighted average will improve the performance of the whole system. We compared the proposed work experimentally with all other classification systems and documented it as giving the best outcomes. The aggregated data is given to decision trees for better dropout prediction. The accuracy of the overall system is 98.6%, which shows that the proposed work provides efficient prediction.
In vitro percutaneous absorption enhancement of granisetron by chemical penetration enhancers.

Science.gov (United States)

Zhao, Nanxi; Cun, Dongmei; Li, Wei; Ma, Xu; Sun, Lin; Xi, Honglei; Li, Li; Fang, Liang

2013-04-01

Granisetron (GRN), a potent antiemetic agent, is frequently used to prevent nausea and vomiting induced by cancer cytotoxic chemotherapy and radiation therapy. As part of our efforts to further modify the physicochemical properties of this market drug, with the ultimate goal to formulate a better dosage form for GRN, this work was carried out to improve its permeability in vitro. The permeation behavior of GRN in isopropyl myristate (IPM) was investigated across excised rabbit abdominal skin and the enhancing activities of three novel O-acylmenthol derivatives synthesized in our laboratory as well as five well-known chemical enhancers were evaluated. It was found that the steady-state flux of granisetron free base (GRN-B) was about 26-fold higher than that of granisetron hydrochloride (GRN-H). The novel enhancer, 2-isopropyl-5-methylcyclohexyl heptanoate (M-HEP), was observed to provide the most significant enhancement for the absorption of GRN-B. When incorporated in the donor solution with the optimal enhancer M-HEP, the steady-state flux of GRN-B increased from (196.44 ± 12.03) μg·cm⁻²·h⁻¹ to (1044.95 ± 71.99) μg·cm⁻²·h⁻¹ (P < 0.01). These findings indicated that the application of chemical enhancers was an effective approach to increase the percutaneous absorption of GRN in vitro.
Murine knockin model for progranulin-deficient frontotemporal dementia with nonsense-mediated mRNA decay.

Science.gov (United States)

Nguyen, Andrew D; Nguyen, Thi A; Zhang, Jiasheng; Devireddy, Swathi; Zhou, Ping; Karydas, Anna M; Xu, Xialian; Miller, Bruce L; Rigo, Frank; Ferguson, Shawn M; Huang, Eric J; Walther, Tobias C; Farese, Robert V

2018-03-20

Frontotemporal dementia (FTD) is the most common neurodegenerative disorder in individuals under age 60 and has no treatment or cure. Because many cases of FTD result from GRN nonsense mutations, an animal model for this type of mutation is highly desirable for understanding pathogenesis and testing therapies. Here, we generated and characterized Grn R493X knockin mice, which model the most common human GRN mutation, a premature stop codon at arginine 493 (R493X). Homozygous Grn R493X mice have markedly reduced Grn mRNA levels, lack detectable progranulin protein, and phenocopy Grn knockout mice, with CNS microgliosis, cytoplasmic TDP-43 accumulation, reduced synaptic density, lipofuscinosis, hyperinflammatory macrophages, excessive grooming behavior, and reduced survival. Inhibition of nonsense-mediated mRNA decay (NMD) by genetic, pharmacological, or antisense oligonucleotide-based approaches showed that NMD contributes to the reduced mRNA levels in Grn R493X mice and cell lines and in fibroblasts from patients containing the GRN R493X mutation. Moreover, the expressed truncated R493X mutant protein was functional in several assays in progranulin-deficient cells. Together, these findings establish a murine model for in vivo testing of NMD inhibition or other therapies as potential approaches for treating progranulin deficiency caused by the R493X mutation. Copyright © 2018 the Author(s). Published by PNAS.
Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

OpenAIRE

Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

2009-01-01

Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings: We have analyzed the simulated gene expression ...
Inference

DEFF Research Database (Denmark)

Møller, Jesper

(This text, written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives’, edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference’.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Estimating the Optimal Dosage of Sodium Valproate in Idiopathic Generalized Epilepsy with Adaptive Neuro-Fuzzy Inference System

Directory of Open Access Journals (Sweden)

Somayyeh Lotfi Noghabi

2012-07-01

Introduction: Epilepsy is a clinical syndrome in which seizures have a tendency to recur. Sodium valproate is the most effective drug in the treatment of all types of generalized seizures. Finding the optimal dosage (the lowest effective dose) of sodium valproate is a real challenge to all neurologists. In this study, a new approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) was presented for estimating the optimal dosage of sodium valproate in Idiopathic Generalized Epilepsy (IGE) patients. Methods: 40 patients with Idiopathic Generalized Epilepsy, who were referred to the neurology department of Mashhad University of Medical Sciences between 2006 and 2011, were included in this study. The ANFIS function constructs a Fuzzy Inference System (FIS) whose membership function parameters are tuned (adjusted) using either a back-propagation algorithm alone or in combination with a least-squares type of method (hybrid algorithm). In this study, we used the hybrid method for adjusting the parameters. Results: The R-square of the proposed system was %598 and the Pearson correlation coefficient was significant (P < 0.05). Although the accuracy of the model was not high, it was good enough to be applied for treating the IGE patients with sodium valproate. Discussion: This paper presented a new application of ANFIS for estimating the optimal dosage of sodium valproate in IGE patients. Fuzzy set theory plays an important role in dealing with uncertainty when making decisions in medical applications. Collectively, it seems that ANFIS has a high capacity to be applied in medical sciences, especially neurology.
Functional inference of complex anatomical tendinous networks at a macroscopic scale via sparse experimentation.

Science.gov (United States)

Saxena, Anupam; Lipson, Hod; Valero-Cuevas, Francisco J

2012-01-01

In systems and computational biology, much effort is devoted to functional identification of systems and networks at the molecular or cellular scale. However, similarly important networks exist at anatomical scales such as the tendon network of human fingers: the complex array of collagen fibers that transmits and distributes muscle forces to finger joints. This network is critical to the versatility of the human hand, and its function has been debated since at least the 16th century. Here, we experimentally infer the structure (both topology and parameter values) of this network through sparse interrogation with force inputs. A population of models representing this structure co-evolves in simulation with a population of informative future force inputs via the predator-prey estimation-exploration algorithm. Model fitness depends on their ability to explain experimental data, while the fitness of future force inputs depends on causing maximal functional discrepancy among current models. We validate our approach by inferring two known synthetic Latex networks, and one anatomical tendon network harvested from a cadaver's middle finger. We find that functionally similar but structurally diverse models can exist within a narrow range of the training set and cross-validation errors. For the Latex networks, models with low training set error [...] functional structure of complex anatomical networks. This work expands current bioinformatics inference approaches by demonstrating that sparse, yet informative interrogation of biological specimens holds significant computational advantages in accurate and efficient inference over random testing, or assuming model topology and only inferring parameter values. These findings also hold clues to both our evolutionary history and the development of versatile machines.
Calibrated birth-death phylogenetic time-tree priors for Bayesian inference.

Science.gov (United States)

Heled, Joseph; Drummond, Alexei J

2015-05-01

Here we introduce a general class of multiple calibration birth-death tree priors for use in Bayesian phylogenetic inference. All tree priors in this class separate ancestral node heights into a set of "calibrated nodes" and "uncalibrated nodes" such that the marginal distribution of the calibrated nodes is user-specified whereas the density ratio of the birth-death prior is retained for trees with equal values for the calibrated nodes. We describe two formulations, one in which the calibration information informs the prior on ranked tree topologies, through the (conditional) prior, and the other which factorizes the prior on divergence times and ranked topologies, thus allowing uniform, or any arbitrary prior distribution on ranked topologies. Although the first of these formulations has some attractive properties, the algorithm we present for computing its prior density is computationally intensive. However, the second formulation is always faster and computationally efficient for up to six calibrations. We demonstrate the utility of the new class of multiple-calibration tree priors using both small simulations and a real-world analysis and compare the results to existing schemes. The two new calibrated tree priors described in this article offer greater flexibility and control of prior specification in calibrated time-tree inference and divergence time dating, and will remove the need for indirect approaches to the assessment of the combined effect of calibration densities and tree priors in Bayesian phylogenetic inference. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Variations on Bayesian Prediction and Inference

Science.gov (United States)

2016-05-09

... inference. 2.2.1 Background: There are a number of statistical inference problems that are not generally formulated via a full probability model ... the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle ...
A Monte Carlo Metropolis-Hastings Algorithm for Sampling from Distributions with Intractable Normalizing Constants

KAUST Repository

Liang, Faming; Jin, Ick-Hoon

2013-01-01

Simulating from distributions with intractable normalizing constants has been a long-standing problem in machine learning. In this letter, we propose a new algorithm, the Monte Carlo Metropolis-Hastings (MCMH) algorithm, for tackling this problem. The MCMH algorithm is a Monte Carlo version of the Metropolis-Hastings algorithm. It replaces the unknown normalizing constant ratio by a Monte Carlo estimate in simulations, while still converging, as shown in the letter, to the desired target distribution under mild conditions. The MCMH algorithm is illustrated with spatial autologistic models and exponential random graph models. Unlike other auxiliary variable Markov chain Monte Carlo (MCMC) algorithms, such as the Møller and exchange algorithms, the MCMH algorithm avoids the requirement for perfect sampling, and thus can be applied to many statistical models for which perfect sampling is not available or very expensive. The MCMH algorithm can also be applied to Bayesian inference for random effect models and missing data problems that involve simulations from a distribution with intractable integrals. © 2013 Massachusetts Institute of Technology.
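The core idea, replacing the unknown normalizing constant ratio with a Monte Carlo estimate inside the Metropolis-Hastings acceptance step, can be illustrated on a toy problem. The sketch below is not the authors' code and rests on several assumptions: the model is a zero-mean Gaussian with precision theta whose normalizing constant is only pretended to be unknown, the auxiliary draws come from an exact sampler (in general they would come from an MCMC run), the prior is flat on theta > 0, the data are i.i.d. so the ratio for n observations is the single-observation ratio raised to the power n, and the names `log_unnorm` and `mcmh` are hypothetical.

```python
import numpy as np

def log_unnorm(x, theta):
    # Unnormalised log-density log p(x, theta) = -theta * x^2 / 2.
    # The normalising constant kappa(theta) is treated as unknown.
    return -0.5 * theta * x ** 2

def mcmh(data, n_iter=5000, m=200, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings with a Monte Carlo estimate of
    kappa(proposal) / kappa(current) (toy sketch)."""
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 1.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.normal()  # symmetric proposal
        if prop > 0:  # flat prior on theta > 0 (assumption)
            # Auxiliary draws from f(. | theta); exact here, MCMC in general.
            y = rng.normal(0.0, 1.0 / np.sqrt(theta), size=m)
            # Importance-sampling estimate of kappa(prop) / kappa(theta).
            log_r = np.log(np.mean(np.exp(log_unnorm(y, prop) - log_unnorm(y, theta))))
            # Log acceptance ratio with the estimated constant ratio plugged in.
            log_alpha = (log_unnorm(data, prop).sum()
                         - log_unnorm(data, theta).sum()
                         - n * log_r)
            if np.log(rng.uniform()) < log_alpha:
                theta = prop
        chain[t] = theta
    return chain
```

Because the toy posterior is a Gamma distribution with shape n/2 + 1 and rate equal to half the sum of squared observations, the chain mean after burn-in can be compared against that closed form as a sanity check.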
A Monte Carlo Metropolis-Hastings Algorithm for Sampling from Distributions with Intractable Normalizing Constants

KAUST Repository

Liang, Faming

2013-08-01

Simulating from distributions with intractable normalizing constants has been a long-standing problem in machine learning. In this letter, we propose a new algorithm, the Monte Carlo Metropolis-Hastings (MCMH) algorithm, for tackling this problem. The MCMH algorithm is a Monte Carlo version of the Metropolis-Hastings algorithm. It replaces the unknown normalizing constant ratio by a Monte Carlo estimate in simulations, while still converging, as shown in the letter, to the desired target distribution under mild conditions. The MCMH algorithm is illustrated with spatial autologistic models and exponential random graph models. Unlike other auxiliary variable Markov chain Monte Carlo (MCMC) algorithms, such as the Møller and exchange algorithms, the MCMH algorithm avoids the requirement for perfect sampling, and thus can be applied to many statistical models for which perfect sampling is not available or very expensive. The MCMH algorithm can also be applied to Bayesian inference for random effect models and missing data problems that involve simulations from a distribution with intractable integrals. © 2013 Massachusetts Institute of Technology.

SVC control enhancement applying self-learning fuzzy algorithm for islanded microgrid

Directory of Open Access Journals (Sweden)

Hossam Gabbar

2016-03-01

Maintaining voltage stability within acceptable levels for islanded Microgrids (MGs) is a challenge due to the limited power exchange between generation and loads. This paper proposes an algorithm to enhance the dynamic performance of islanded MGs in the presence of load disturbances using a Static VAR Compensator (SVC) with a Fuzzy Model Reference Learning Controller (FMRLC). The proposed algorithm compensates for MG nonlinearity via fuzzy membership functions and an inference mechanism embedded in both the controller and the inverse model. Hence, the MG keeps the desired performance at any operating condition. Furthermore, the self-learning capability of the proposed control algorithm compensates for variation in grid parameters even with inadequate information about load dynamics. A reference model was designed to reject bus voltage disturbances with performance achievable by the proposed fuzzy controller. Three simulation scenarios are presented to investigate the effectiveness of the proposed control algorithm in improving the steady-state and transient performance of islanded MGs. The first scenario was conducted without an SVC, the second with an SVC using a PID controller, and the third using the FMRLC algorithm. A comparison of the results shows the ability of the proposed control algorithm to enhance disturbance rejection owing to the learning process.
Predicting Course Passing Using a Hybrid Fuzzy Inference System (Prediksi Kelulusan Mata Kuliah Menggunakan Hybrid Fuzzy Inference System)

Directory of Open Access Journals (Sweden)

Abidatul Izzah

2016-07-01

Therefore, in this study, we used a Decision Tree (DT) technique to generate the rules. The research thus aims to predict whether students pass a course using a hybrid of FIS and DT. The dataset consists of the posttest scores, task scores, quiz scores, and midterm test scores of 106 students of the Polytechnic Kediri who took Algorithms and Data Structures. The research started by generating 5 rules with the decision tree. The next step is the implementation of the FIS, which consists of fuzzification, inference, and defuzzification. The results show that the classifier gives good results, with accuracy, sensitivity, and specificity of 94.33%, 96.55% and 84.21%, respectively. Keywords: Decision Tree, Educational Data Mining, Fuzzy Inference System, Prediction.
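For readers unfamiliar with the three reported measures, the short sketch below shows how accuracy, sensitivity, and specificity follow from a confusion matrix. The counts are hypothetical: they are not given in the abstract and were chosen only because, for 106 students, they reproduce the reported sensitivity and specificity and give an accuracy of 100/106, which is 94.34% when rounded and 94.33% when truncated. Which class the study treated as positive is also an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of actual positives found
        "specificity": tn / (tn + fp),    # share of actual negatives found
    }

# Hypothetical counts (not from the paper), consistent with the reported figures.
metrics = classification_metrics(tp=84, fn=3, tn=16, fp=3)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```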
Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution

Directory of Open Access Journals (Sweden)

Charleston W. K. Chiang

2016-05-01

Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to share an IBD segment if that segment is inherited from a recent shared common ancestor without intervening recombination. Segments several cM long can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample, and there are currently efforts to detect shorter segments from sequencing. Here, we study a problem of identifiability: because existing approaches detect IBD based on contiguous segments of identity-by-state, inferred long segments of IBD may arise from the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that significant proportions of inferred segments 1–2 cM long are results of conflations of two or more shorter segments, each at least 0.2 cM or longer, under demographic scenarios typical for modern humans for all programs tested. The impact of such conflation is much smaller for longer (> 2 cM) segments. This biases the inferred IBD segment length distribution, and so can affect downstream inferences that depend on the assumption that each segment of IBD derives from a single common ancestor. As an example, we present and analyze an estimator of the de novo mutation rate using IBD segments, and demonstrate that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate. Understanding the conflation effect in detail will make its correction in future methods more tractable.
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks

Energy Technology Data Exchange (ETDEWEB)

Kim, M; Leskovec, J

2011-11-14

Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.
Fuzzy Dynamic Discrimination Algorithms for Distributed Knowledge Management Systems

Directory of Open Access Journals (Sweden)

Vasile MAZILESCU

2010-12-01

Reducing the algorithmic complexity of a fuzzy inference engine relies on the following property: the inputs (the fuzzy rules and the fuzzy facts) can be divided into two parts, one being relatively constant for a long time (the fuzzy rules, or the knowledge model) when compared to the second part (the fuzzy facts) for every inference cycle. It therefore makes sense to apply certain transformations to the constant part in order to decrease the time needed to obtain a solution, given that the second part varies but is known at certain moments in time. Transformations performed in advance are called pre-processing or knowledge compilation. The use of variables in the knowledge representation of a Business Rule Management System allows knowledge to be factorised, as in classical knowledge-based systems. The language of first-order predicates facilitates the rigorous formulation of complex knowledge and imposes appropriate reasoning techniques. It is thus necessary to define the description method for fuzzy knowledge, to justify the efficiency of knowledge exploitation when the compiling technique is used, to present the inference engine, and to highlight the functional features of the pattern matching and state space processes. This paper presents the main results of our project PR356 for designing a compiler for fuzzy knowledge, similar to the Rete compiler, that comprises two main components: a static fuzzy discrimination structure (Fuzzy Unification Tree) and the Fuzzy Variables Linking Network. The features of the elementary pattern matching process, which is based on the compiled structure of fuzzy knowledge, are also presented. We developed fuzzy discrimination algorithms for Distributed Knowledge Management Systems (DKMSs). The implementations have been developed in a prototype system, FRCOM (Fuzzy Rule COMpiler).
RESOLVE: A new algorithm for aperture synthesis imaging of extended emission in radio astronomy

Science.gov (United States)

Junklewitz, H.; Bell, M. R.; Selig, M.; Enßlin, T. A.

2016-02-01

We present resolve, a new algorithm for radio aperture synthesis imaging of extended and diffuse emission in total intensity. The algorithm is derived using Bayesian statistical inference techniques, estimating the surface brightness in the sky assuming a priori log-normal statistics. resolve estimates the measured sky brightness in total intensity, and the spatial correlation structure in the sky, which is used to guide the algorithm to an optimal reconstruction of extended and diffuse sources. During this process, the algorithm succeeds in deconvolving the effects of the radio interferometric point spread function. Additionally, resolve provides a map with an uncertainty estimate of the reconstructed surface brightness. Furthermore, with resolve we introduce a new, optimal visibility weighting scheme that can be viewed as an extension to robust weighting. In tests using simulated observations, the algorithm shows improved performance against two standard imaging approaches for extended sources, Multiscale-CLEAN and the Maximum Entropy Method.
Image Analysis of Endoscopic Ultrasonography in Submucosal Tumor Using Fuzzy Inference

Directory of Open Access Journals (Sweden)

Kwang Baek Kim

2013-01-01

Endoscopists usually diagnose submucosal tumors based on a subjective evaluation of general images obtained by endoscopic ultrasonography. In this paper, we propose a method to automatically extract areas of gastrointestinal stromal tumor (GIST) and lipoma from the ultrasonic image to assist those specialists. We also propose an algorithm to differentiate GIST from non-GIST by fuzzy inference from such images after applying an ROC curve with the mean and standard deviation of brightness information. In experiments using real images that medical specialists use, we verify that our method is sufficiently helpful to such specialists for efficient classification of submucosal tumors.
Time Series Modeling of Nano-Gold Immunochromatographic Assay via Expectation Maximization Algorithm.

Science.gov (United States)

Zeng, Nianyin; Wang, Zidong; Li, Yurong; Du, Min; Cao, Jie; Liu, Xiaohui

2013-12-01

In this paper, the expectation maximization (EM) algorithm is applied to the modeling of the nano-gold immunochromatographic assay (nano-GICA) via available time series of the measured signal intensities of the test and control lines. The model for the nano-GICA is developed as the stochastic dynamic model that consists of a first-order autoregressive stochastic dynamic process and a noisy measurement. By using the EM algorithm, the model parameters, the actual signal intensities of the test and control lines, as well as the noise intensity can be identified simultaneously. Three different time series data sets concerning the target concentrations are employed to demonstrate the effectiveness of the introduced algorithm. Several indices are also proposed to evaluate the inferred models. It is shown that the model fits the data very well.
Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations

Science.gov (United States)

Bovy, Jo; Hogg, David W.; Roweis, Sam T.

2011-06-01

We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
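The setup described above can be written compactly. The notation below is an assumption introduced for illustration (the abstract itself defines no symbols): v is the underlying d-dimensional quantity, w_i the i-th noisy observation, R_i a projection matrix encoding which directions are observed, S_i the individual uncertainty covariance, and alpha_j, m_j, V_j the weight, mean, and covariance of mixture component j.

```latex
\begin{aligned}
p(\mathbf{v}) &= \sum_{j=1}^{K} \alpha_j \,
  \mathcal{N}\!\left(\mathbf{v} \mid \mathbf{m}_j, \mathbf{V}_j\right), \\
\mathbf{w}_i &= \mathbf{R}_i \mathbf{v}_i + \boldsymbol{\epsilon}_i,
  \qquad \boldsymbol{\epsilon}_i \sim \mathcal{N}\!\left(\mathbf{0}, \mathbf{S}_i\right), \\
p(\mathbf{w}_i) &= \sum_{j=1}^{K} \alpha_j \,
  \mathcal{N}\!\left(\mathbf{w}_i \mid \mathbf{R}_i \mathbf{m}_j,\;
  \mathbf{R}_i \mathbf{V}_j \mathbf{R}_i^{\top} + \mathbf{S}_i\right).
\end{aligned}
```

The last line follows because a linear projection of a Gaussian plus independent Gaussian noise is again Gaussian, so each observation is drawn from its own convolved mixture while the parameters alpha_j, m_j, V_j are shared by all data points.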
Geographic population structure analysis of worldwide human populations infers their biogeographical origins

Science.gov (United States)

Elhaik, Eran; Tatarinova, Tatiana; Chebotarev, Dmitri; Piras, Ignazio S.; Maria Calò, Carla; De Montis, Antonella; Atzori, Manuela; Marini, Monica; Tofanelli, Sergio; Francalacci, Paolo; Pagani, Luca; Tyler-Smith, Chris; Xue, Yali; Cucca, Francesco; Schurr, Theodore G.; Gaieski, Jill B.; Melendez, Carlalynne; Vilar, Miguel G.; Owings, Amanda C.; Gómez, Rocío; Fujita, Ricardo; Santos, Fabrício R.; Comas, David; Balanovsky, Oleg; Balanovska, Elena; Zalloua, Pierre; Soodyall, Himla; Pitchappan, Ramasamy; GaneshPrasad, ArunKumar; Hammer, Michael; Matisoo-Smith, Lisa; Wells, R. Spencer; Acosta, Oscar; Adhikarla, Syama; Adler, Christina J.; Bertranpetit, Jaume; Clarke, Andrew C.; Cooper, Alan; Der Sarkissian, Clio S. I.; Haak, Wolfgang; Haber, Marc; Jin, Li; Kaplan, Matthew E.; Li, Hui; Li, Shilin; Martínez-Cruz, Begoña; Merchant, Nirav C.; Mitchell, John R.; Parida, Laxmi; Platt, Daniel E.; Quintana-Murci, Lluis; Renfrew, Colin; Lacerda, Daniela R.; Royyuru, Ajay K.; Sandoval, Jose Raul; Santhakumari, Arun Varatharajan; Soria Hernanz, David F.; Swamikrishnan, Pandikumar; Ziegle, Janet S.

2014-01-01

The search for a method that utilizes biological information to predict humans’ place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an accuracy of 700 km in Europe, they were inaccurate elsewhere. Here we describe the Geographic Population Structure (GPS) algorithm and demonstrate its accuracy with three data sets using 40,000–130,000 SNPs. GPS placed 83% of worldwide individuals in their country of origin. Applied to over 200 Sardinian villagers, GPS placed a quarter of them in their villages and most of the rest within 50 km of their villages. GPS’s accuracy and power to infer the biogeography of worldwide individuals down to their country or, in some cases, village of origin underscores the promise of admixture-based methods for biogeography and has ramifications for genetic ancestry testing. PMID:24781250
CGBayesNets: conditional Gaussian Bayesian network learning and inference with mixed discrete and continuous data.

Science.gov (United States)

McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T

2014-06-01

Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
The inference from a single case: moral versus scientific inferences in implementing new biotechnologies.

Science.gov (United States)

Hofmann, B

2008-06-01

Are there similarities between scientific and moral inference? This is the key question in this article. It takes as its point of departure an instance of one person's story in the media changing both Norwegian public opinion and a brand-new Norwegian law prohibiting the use of saviour siblings. The case appears to falsify existing norms and to establish new ones. The analysis of this case reveals similarities in the modes of inference in science and morals, inasmuch as (a) a single case functions as a counter-example to an existing rule; (b) there is a common presupposition of stability, similarity and order, which makes it possible to reason from a few cases to a general rule; and (c) this makes it possible to hold things together and retain order. In science, these modes of inference are referred to as falsification, induction and consistency. In morals, they have a variety of other names. Hence, even without abandoning the fact-value divide, there appear to be similarities between inference in science and inference in morals, which may encourage communication across the boundaries between "the two cultures" and which are relevant to medical humanities.
Forecasting Monthly Electricity Demands by Wavelet Neuro-Fuzzy System Optimized by Heuristic Algorithms

Directory of Open Access Journals (Sweden)

Jeng-Fung Chen

2018-02-01

Full Text Available Electricity load forecasting plays a paramount role in capacity planning, scheduling, and the operation of power systems. Reliable and accurate planning and prediction of electricity load are therefore vital. In this study, a novel approach for forecasting monthly electricity demands by wavelet transform and a neuro-fuzzy system is proposed. Firstly, the most appropriate inputs are selected and a dataset is constructed. Then, the Haar wavelet transform is utilized to decompose the load data and eliminate noise. In the model, a hierarchical adaptive neuro-fuzzy inference system (HANFIS) is suggested to solve the curse-of-dimensionality problem. Several heuristic algorithms, including the Gravitational Search Algorithm (GSA), Cuckoo Optimization Algorithm (COA), and Cuckoo Search (CS), are utilized to optimize the clustering parameters which help form the rule base, and an adaptive neuro-fuzzy inference system (ANFIS) optimizes the parameters in the antecedent and consequent parts of each sub-model. The proposed approach was applied to forecast the electricity load of Hanoi, Vietnam. The constructed models have shown high forecasting performance based on the performance indices calculated. The results demonstrate the validity of the approach. The obtained results were also compared with those of several other well-known methods, including autoregressive integrated moving average (ARIMA) and multiple linear regression (MLR). In our study, the wavelet CS-HANFIS model outperformed the others and provided more accurate forecasting.
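The decomposition step in the abstract above can be illustrated with a minimal sketch of one level of the orthonormal Haar transform. The abstract does not say how noise is eliminated, so the hard threshold on detail coefficients used here is an assumption, and the function names and the toy input are hypothetical rather than taken from the paper.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums give the smooth part, scaled differences the detail part.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Invert haar_step exactly (up to floating-point rounding)."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def denoise(signal, threshold):
    """Assumed denoising rule: zero out small detail coefficients, then reconstruct."""
    approx, detail = haar_step(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in detail]
    return inverse_haar_step(approx, kept)
```

With a threshold larger than every detail coefficient, `denoise` replaces each pair of samples by its average, which is the coarsest smoothing this single level can produce.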
A novel telomerase activator suppresses lung damage in a murine model of idiopathic pulmonary fibrosis.

Science.gov (United States)

Le Saux, Claude Jourdan; Davy, Philip; Brampton, Christopher; Ahuja, Seema S; Fauce, Steven; Shivshankar, Pooja; Nguyen, Hieu; Ramaseshan, Mahesh; Tressler, Robert; Pirot, Zhu; Harley, Calvin B; Allsopp, Richard

2013-01-01

The emergence of diseases associated with telomere dysfunction, including AIDS, aplastic anemia and pulmonary fibrosis, has bolstered interest in telomerase activators. We report identification of a new small molecule activator, GRN510, with activity ex vivo and in vivo. Using a novel mouse model, we tested the potential of GRN510 to limit fibrosis induced by bleomycin in mTERT heterozygous mice. Treatment with GRN510 at 10 mg/kg/day activated telomerase 2-4 fold both in hematopoietic progenitors ex vivo and in bone marrow and lung tissue in vivo, respectively. Telomerase activation was countered by co-treatment with Imetelstat (GRN163L), a potent telomerase inhibitor. In this model of bleomycin-induced fibrosis, treatment with GRN510 suppressed the development of fibrosis and accumulation of senescent cells in the lung via a mechanism dependent upon telomerase activation. Treatment of small airway epithelial cells (SAEC) or lung fibroblasts ex vivo with GRN510 revealed telomerase activating and replicative lifespan promoting effects only in the SAEC, suggesting that the mechanism accounting for the protective effects of GRN510 against induced lung fibrosis involves specific types of lung cells. Together, these results support the use of small molecule activators of telomerase in therapies to treat idiopathic pulmonary fibrosis.
Statistical inference using weak chaos and infinite memory

Energy Technology Data Exchange (ETDEWEB)

Welling, Max; Chen Yutian, E-mail: welling@ics.uci.ed, E-mail: yutian.chen@uci.ed [Donald Bren School of Information and Computer Science, University of California Irvine CA 92697-3425 (United States)

2010-06-01

We describe a class of deterministic weakly chaotic dynamical systems with infinite memory. These 'herding systems' combine learning and inference into one algorithm, where moments or data-items are converted directly into an arbitrarily long sequence of pseudo-samples. This sequence has infinite range correlations and as such is highly structured. We show that its information content, as measured by sub-extensive entropy, can grow as fast as K log T, which is faster than the usual 1/2 K log T for exchangeable sequences generated by random posterior sampling from a Bayesian model. In one dimension we prove that herding sequences are equivalent to Sturmian sequences which have complexity exactly log(T + 1). More generally, we advocate the application of the rich theoretical framework around nonlinear dynamical systems, chaos theory and fractal geometry to statistical learning.
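As a rough illustration of how moments can be turned directly into a deterministic sequence of pseudo-samples, the sketch below applies the commonly described herding update to a discrete distribution with indicator features: pick the state with the largest weight, then add the target moments and subtract the features of the chosen state. The function name, the initial weights and the example probabilities are assumptions made for illustration, not details from the abstract.

```python
def herd(probs, steps):
    """Deterministic herding over K discrete states with indicator features."""
    weights = list(probs)  # assumed initialisation: start at the target moments
    samples = []
    for _ in range(steps):
        # Choose the state that maximises the inner product of weights and features.
        k = max(range(len(weights)), key=lambda i: weights[i])
        samples.append(k)
        # Weight update: add the target moments, subtract the chosen state's features.
        for i, p in enumerate(probs):
            weights[i] += p
        weights[k] -= 1.0
    return samples

if __name__ == "__main__":
    seq = herd([0.5, 0.3, 0.2], 20)
    print(seq)
```

No random numbers are involved, yet the empirical state frequencies of the sequence track the target probabilities closely, because the weights stay bounded while the number of steps grows.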
Haplotype inference in general pedigrees with two sites

Directory of Open Access Journals (Sweden)

Doan Duong D

2011-04-01

Full Text Available Abstract Background Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with two sites for all members. Results We show that this NP-hard problem can be parametrically reduced to the Bipartization by Edge Removal problem and therefore can be solved by an O(2^k · n^2) exact algorithm, where n is the number of members and k is the number of recombination events. Conclusions Our work can therefore be useful for genetic disease studies to track down how changes in haplotypes such as recombinations relate to genetic disease.
Introductory statistical inference

CERN Document Server

Mukhopadhyay, Nitis

2014-01-01

This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist
Active inference, communication and hermeneutics.

Science.gov (United States)

Friston, Karl J; Frith, Christopher D

2015-07-01

Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
An Online Causal Inference Framework for Modeling and Designing Systems Involving User Preferences: A State-Space Approach

Directory of Open Access Journals (Sweden)

Ibrahim Delibalta

2017-01-01

Full Text Available We provide a causal inference framework to model the effects of machine learning algorithms on user preferences. We then use this mathematical model to prove that the overall system can be tuned to alter those preferences in a desired manner. A user can be an online shopper or a social media user, exposed to digital interventions produced by machine learning algorithms. A user preference can be anything from inclination towards a product to a political party affiliation. Our framework uses a state-space model to represent user preferences as latent system parameters which can only be observed indirectly via online user actions such as a purchase activity or social media status updates, shares, blogs, or tweets. Based on these observations, machine learning algorithms produce digital interventions such as targeted advertisements or tweets. We model the effects of these interventions through a causal feedback loop, which alters the corresponding preferences of the user. We then introduce algorithms in order to estimate and later tune the user preferences to a particular desired form. We demonstrate the effectiveness of our algorithms through experiments in different scenarios.

Mechanical fault diagnostics for induction motor with variable speed drives using Adaptive Neuro-fuzzy Inference System

Energy Technology Data Exchange (ETDEWEB)

Ye, Z. [Department of Electrical & Computer Engineering, Queen's University, Kingston, Ont. (Canada K7L 3N6)]; Sadeghian, A. [Department of Computer Science, Ryerson University, Toronto, Ont. (Canada M5B 2K3)]; Wu, B. [Department of Electrical & Computer Engineering, Ryerson University, Toronto, Ont. (Canada M5B 2K3)]

2006-06-15

A novel online diagnostic algorithm for mechanical faults of electrical machines with variable speed drive systems is presented in this paper. Using Wavelet Packet Decomposition (WPD), a set of feature coefficients, represented with different frequency resolutions, related to the mechanical faults is extracted from the stator current of the induction motors operating over a wide range of speeds. A new integrated diagnostic system for electrical machine mechanical faults is then proposed using multiple Adaptive Neuro-fuzzy Inference Systems (ANFIS). This paper shows that using multiple ANFIS units significantly reduces the scale and complexity of the system and speeds up the training of the network. The diagnostic algorithm is validated on a three-phase induction motor drive system, and it is proven to be capable of detecting rotor bar breakage and air gap eccentricity faults with high accuracy. The algorithm is applicable to a variety of industrial applications where either continuous on-line monitoring or off-line fault diagnostics is required. (author)
Space-Time Joint Interference Cancellation Using Fuzzy-Inference-Based Adaptive Filtering Techniques in Frequency-Selective Multipath Channels

Directory of Open Access Journals (Sweden)

Chen Yu-Fan

2006-01-01

Full Text Available An adaptive minimum mean-square error (MMSE) array receiver based on the fuzzy-logic recursive least-squares (RLS) algorithm is developed for asynchronous DS-CDMA interference suppression in the presence of frequency-selective multipath fading. This receiver employs a fuzzy-logic control mechanism to perform the nonlinear mapping of the squared error and the squared error variation into a forgetting factor. For real-time applicability, a computationally efficient version of the proposed receiver is derived based on the least-mean-square (LMS) algorithm using a fuzzy-inference-controlled step size. This receiver is capable of providing both fast convergence/tracking capability and small steady-state misadjustment compared with conventional LMS- and RLS-based MMSE DS-CDMA receivers. Simulations show that the fuzzy-logic LMS and RLS algorithms outperform, respectively, other variable step-size LMS (VSS-LMS) and variable forgetting factor RLS (VFF-RLS) algorithms by at least 3 dB and 1.5 dB in bit-error-rate (BER) for multipath fading channels.
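To make the idea of an error-driven step size concrete, here is a minimal sketch of a variable step-size LMS filter. The abstract does not give the fuzzy rule base, so `step_size_rule` is a hypothetical crisp stand-in that simply grows the step when the squared error or its change is large; the bounds `mu_min` and `mu_max`, the gain of 0.5, and the two-tap test system in the usage note are likewise assumptions.

```python
import random

def step_size_rule(sq_err, sq_err_change, mu_min=0.05, mu_max=0.5):
    """Hypothetical crisp stand-in for the fuzzy controller:
    larger or faster-changing squared errors give a larger step size."""
    return max(mu_min, min(mu_max, mu_min + 0.5 * (sq_err + abs(sq_err_change))))

def vss_lms(inputs, desired, num_taps):
    """Variable step-size LMS: adapt FIR weights so the filter output tracks `desired`."""
    weights = [0.0] * num_taps
    prev_sq_err = 0.0
    for n in range(num_taps - 1, len(inputs)):
        # Most recent sample first, matching the tap order of the weights.
        window = [inputs[n - k] for k in range(num_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        err = desired[n] - estimate
        sq_err = err * err
        mu = step_size_rule(sq_err, sq_err - prev_sq_err)
        # Standard LMS weight update with the time-varying step size.
        weights = [w + mu * err * x for w, x in zip(weights, window)]
        prev_sq_err = sq_err
    return weights
```

For example, feeding the filter a noiseless signal generated as `d[n] = 0.7 * x[n] - 0.3 * x[n - 1]` with inputs drawn from `random.Random(0).uniform(-1, 1)` lets the two weights settle near 0.7 and -0.3. A real receiver would operate on complex array snapshots rather than this scalar toy system.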
Effective network inference through multivariate information transfer estimation

Science.gov (United States)

Dahlqvist, Carl-Henrik; Gnabo, Jean-Yves

2018-06-01

Network representation has steadily gained in popularity over the past decades. In many disciplines, such as finance, genetics, neuroscience or human travel, to cite a few, the network may not be directly observable and needs to be inferred from time-series data, leading to the issue of separating direct interactions between two entities forming the network from indirect interactions coming through its remaining part. Drawing on recent contributions proposing strategies to deal with this problem, such as the so-called "global silencing" approach of Barzel and Barabasi or the "network deconvolution" of Feizi et al. (2013), we propose a novel methodology to infer an effective network structure from multivariate conditional information transfers. Its core principle is to test the information transfer between two nodes through a step-wise approach by conditioning the transfer for each pair on a specific set of relevant nodes as identified by our algorithm from the rest of the network. The methodology is model free and can be applied to high-dimensional networks with both inter-lag and intra-lag relationships. It outperforms state-of-the-art approaches for eliminating the redundancies and more generally retrieving simulated artificial networks in our Monte-Carlo experiments. We apply the method to stock market data at different frequencies (15 min, 1 h, 1 day) to retrieve the network of the largest US financial institutions and then document how banks' centrality measurements relate to banks' systemic vulnerability.
Portable inference engine: An extended CLIPS for real-time production systems

Science.gov (United States)

Le, Thach; Homeier, Peter

1988-01-01

The present C-Language Integrated Production System (CLIPS) architecture has not been optimized to deal with the constraints of real-time production systems. Matching in CLIPS is based on the Rete Net algorithm, whose assumption of working memory stability might fail to be satisfied in a system subject to real-time dataflow. Further, the CLIPS forward-chaining control mechanism with a predefined conflict resolution strategy may not effectively focus the system's attention on situation-dependent current priorities, or appropriately address different kinds of knowledge which might appear in a given application. Portable Inference Engine (PIE) is a production system architecture based on CLIPS which attempts to create a more general tool while addressing the problems of real-time expert systems. Features of the PIE design include a modular knowledge base, a modified Rete Net algorithm, a bi-directional control strategy, and multiple user-defined conflict resolution strategies. Problems associated with real-time applications are analyzed and an explanation is given for how the PIE architecture addresses these problems.
Inference of Tumor Phylogenies with Improved Somatic Mutation Discovery

KAUST Repository

Salari, Raheleh

2013-01-01

Next-generation sequencing technologies provide a powerful tool for studying genome evolution during progression of advanced diseases such as cancer. Although many recent studies have employed new sequencing technologies to detect mutations across multiple, genetically related tumors, current methods do not exploit available phylogenetic information to improve the accuracy of their variant calls. Here, we present a novel algorithm that uses somatic single nucleotide variations (SNVs) in multiple, related tissue samples as lineage markers for phylogenetic tree reconstruction. Our method then leverages the inferred phylogeny to improve the accuracy of SNV discovery. Experimental analyses demonstrate that our method achieves up to 32% improvement for somatic SNV calling of multiple related samples over the accuracy of GATK's Unified Genotyper, the state-of-the-art multisample SNV caller. © 2013 Springer-Verlag.
RISK MANAGEMENT AUTOMATION OF SOFTWARE PROJECTS BASED ON FUZZY INFERENCE

Directory of Open Access Journals (Sweden)

T. M. Zubkova

2015-09-01

Full Text Available The suitability of one of the intelligent methods for risk management of software projects is shown, based on a review of existing fuzzy inference algorithms for applied problems. Information sources in the management of software projects are analyzed; major and minor risks are highlighted. The most critical parameters have been singled out, making it possible to estimate the occurrence of adverse situations (project duration, the frequency of changes in the customer’s requirements, work deadlines, developers’ experience of participating in such projects, and others). A method of qualitative fuzzy description based on fuzzy logic has been developed for the analysis of these parameters. Evaluation of possible situations and knowledge base formation rely on a survey of experts. The main limitations of existing automated systems have been identified in relation to their applicability to risk management in software design. Theoretical research set the stage for a software system that makes it possible to automate the risk management process for software projects. The developed software system automates the process of fuzzy inference in the following stages: rule base formation of the fuzzy inference systems, fuzzification of input variables, aggregation of sub-conditions, activation and accumulation of conclusions for fuzzy production rules, and defuzzification of variables. The result of automating risk management in software design is a quantitative and qualitative assessment of the risks and expert advice for their minimization. The practical significance of the work lies in the fact that implementation of the developed automated system makes it possible to improve the performance of software projects.
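The inference stages listed in the abstract above can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the two inputs (rate of requirement changes and team experience, both scaled to the unit interval), the linear membership functions, the two hand-written rules, and the use of min for aggregation and activation, max for accumulation, and a centroid for defuzzification. The paper's actual rule base comes from an expert survey and is not reproduced.

```python
def low(x):
    """Membership in the fuzzy set 'low' on the unit interval (assumed linear)."""
    return max(0.0, min(1.0, 1.0 - x))

def high(x):
    """Membership in the fuzzy set 'high' on the unit interval (assumed linear)."""
    return max(0.0, min(1.0, x))

def assess_risk(change_rate, experience, grid=101):
    """Toy Mamdani-style inference returning a risk score in [0, 1], or None if no rule fires."""
    # Stage 1: fuzzification of the input variables.
    change = {"low": low(change_rate), "high": high(change_rate)}
    exper = {"low": low(experience), "high": high(experience)}
    # Stage 2: aggregation of sub-conditions (AND taken as min) for two hypothetical rules.
    fire_high = min(change["high"], exper["low"])   # IF change is high AND experience is low THEN risk is high
    fire_low = min(change["low"], exper["high"])    # IF change is low AND experience is high THEN risk is low
    # Stages 3 and 4: activation (clip each conclusion) and accumulation (pointwise max).
    ys = [i / (grid - 1) for i in range(grid)]
    mu = [max(min(fire_high, high(y)), min(fire_low, low(y))) for y in ys]
    # Stage 5: defuzzification by the centroid of the accumulated membership function.
    total = sum(mu)
    if total == 0.0:
        return None
    return sum(y * m for y, m in zip(ys, mu)) / total
```

Frequent requirement changes combined with an inexperienced team push the score above 0.5, while the opposite combination pulls it below 0.5.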
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference

Science.gov (United States)

Hall, Alex; Taylor, Andy

2017-06-01

We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
Optimization methods for logical inference

CERN Document Server

Chandru, Vijay

2011-01-01

Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in
Evaluating ortholog prediction algorithms in a yeast model clade.

Directory of Open Access Journals (Sweden)

Leonidas Salichos

Full Text Available BACKGROUND: Accurate identification of orthologs is crucial for evolutionary studies and for functional annotation. Several algorithms have been developed for ortholog delineation, but so far, manually curated genome-scale biological databases of orthologous genes for algorithm evaluation have been lacking. We evaluated four popular ortholog prediction algorithms (MultiParanoid; OrthoMCL; RBH: Reciprocal Best Hit; and RSD: Reciprocal Smallest Distance; the last two extended into clustering algorithms cRBH and cRSD, respectively, so that they can predict orthologs across multiple taxa) against a set of 2,723 groups of high-quality curated orthologs from 6 Saccharomycete yeasts in the Yeast Gene Order Browser. RESULTS: Examination of sensitivity [TP/(TP+FN)], specificity [TN/(TN+FP)], and accuracy [(TP+TN)/(TP+TN+FP+FN)] across a broad parameter range showed that cRBH was the most accurate and specific algorithm, whereas OrthoMCL was the most sensitive. Evaluation of the algorithms across a varying number of species showed that cRBH had the highest accuracy and lowest false discovery rate [FP/(FP+TP)], followed by cRSD. Of the six species in our set, three descended from an ancestor that underwent whole genome duplication. Subsequent differential duplicate loss events in the three descendants resulted in distinct classes of gene loss patterns, including cases where the genes retained in the three descendants are paralogs, constituting 'traps' for ortholog prediction algorithms. We found that the false discovery rate of all algorithms dramatically increased in these traps. CONCLUSIONS: These results suggest that simple algorithms, like cRBH, may be better ortholog predictors than more complex ones (e.g., OrthoMCL and MultiParanoid) for evolutionary and functional genomics studies where the objective is the accurate inference of single-copy orthologs (e.g., molecular phylogenetics), but that all algorithms fail to accurately predict orthologs when paralogy
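The four evaluation measures defined in the abstract above translate directly into code. The sketch below computes them from confusion-matrix counts; the function name and the example counts in the usage note are hypothetical and are not results from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Evaluation measures from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),                   # TP / (TP + FN)
        "specificity": tn / (tn + fp),                   # TN / (TN + FP)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),     # (TP + TN) / all predictions
        "false_discovery_rate": fp / (fp + tp),          # FP / (FP + TP)
    }

if __name__ == "__main__":
    # Made-up counts, used only to show the call.
    print(confusion_metrics(tp=80, fp=20, tn=890, fn=10))
```

With the made-up counts above, accuracy is 970/1000 = 0.97 and the false discovery rate is 20/100 = 0.2, which shows how a method can look accurate overall while still producing a noticeable share of false calls among its positive predictions.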
Integration of Adaptive Neuro-Fuzzy Inference System, Neural Networks and Geostatistical Methods for Fracture Density Modeling

Directory of Open Access Journals (Sweden)

Ja’fari A.

2014-01-01

Full Text Available Image logs provide useful information for fracture study in naturally fractured reservoirs. Fracture dip, azimuth, aperture and fracture density can be obtained from image logs and have great importance in naturally fractured reservoir characterization. Imaging all fractured parts of hydrocarbon reservoirs and interpreting the results is expensive and time consuming. In this study, an improved method is proposed to make a quantitative correlation between fracture densities obtained from image logs and conventional well log data by integrating different artificial intelligence systems. The proposed method combines the results of Adaptive Neuro-Fuzzy Inference System (ANFIS) and Neural Networks (NN) algorithms for overall estimation of fracture density from conventional well log data. A simple averaging method was used to obtain a better result by combining the results of ANFIS and NN. The algorithm was applied to other wells of the field to obtain fracture density. In order to model the fracture density in the reservoir, we used variography and sequential simulation algorithms such as Sequential Indicator Simulation (SIS) and Truncated Gaussian Simulation (TGS). The overall algorithm was applied to the Asmari reservoir of one of the SW Iranian oil fields. Histogram analysis was applied to control the quality of the obtained models. Results of this study show that for a higher number of fracture facies the TGS algorithm works better than SIS, but for a small number of fracture facies both algorithms provide approximately the same results.
A Randomized Cross‐over Study of High‐dose Metoclopramide plus Dexamethasone versus Granisetron plus Dexamethasone in Patients Receiving Chemotherapy with High‐dose Cisplatin

Science.gov (United States)

Eguchi, Kenji; Shinkai, Tetsu; Tamura, Tomohide; Ohe, Yuichiro; Nisio, Masato; Kunikane, Hiroshi; Arioka, Hitoshi; Karato, Atsuya; Nakashima, Hajime; Sasaki, Yasutsuna; Tajima, Kinuko; Tada, Noriko; Saijo, Nagahiro

1994-01-01

We carried out a randomized, single‐blind, cross‐over trial to compare the antiemetic effect, for both acute and delayed emesis, of granisetron plus dexamethasone (GRN+Dx) with that of high‐dose metoclopramide plus dexamethasone (HDMP+Dx). Fifty‐four patients with primary or metastatic lung cancer, given single‐dose cisplatin (> 80 mg/m2) chemotherapy more than twice, were enrolled in this study. They were treated with both HDMP+Dx and GRN+Dx in two consecutive chemotherapy courses. On day 1, patients experienced a mean of 2.5 (SD = 4.3) and 0.1 (SD = 0.4) episodes of vomiting in the HDMP+Dx and the GRN+Dx groups, respectively (P = 0.0008). Complete response rate on day 1 was 45 and 90% in the HDMP+Dx and the GRN+Dx groups, respectively (P = 0.0001). Patients treated with GRN+Dx had a tendency to suffer more episodes of vomiting than the HDMP+Dx group on days 2–5, but it was not statistically significant. Twenty‐four patients (57%) preferred the GRN+Dx treatment and 14 patients (33%), HDMP+Dx. In the HDMP+Dx group, nine patients (21%) had an extrapyramidal reaction, and 5 patients (12%) had constipation that lasted for at least two days. In contrast, no patients had extrapyramidal reactions, and 18 patients (43%) had constipation in the GRN+Dx group (P < 0.01). GRN+Dx was more effective than HDMP+Dx only in preventing the acute emesis induced by cisplatin. An effective treatment for delayed emesis is still needed. PMID:7829401
Inference in "poor" languages

Energy Technology Data Exchange (ETDEWEB)

Petrov, S.

1996-10-01

Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of the existence of a finite, complete and consistent inference rule system for a "poor" language is stated independently of the syntax of the language or of the rules. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.
Empirical Analysis of Stochastic Volatility Model by Hybrid Monte Carlo Algorithm

International Nuclear Information System (INIS)

Takaishi, Tetsuya

2013-01-01

The stochastic volatility (SV) model is one of the volatility models that infer the latent volatility of asset returns. Bayesian inference of the SV model is performed by the hybrid Monte Carlo (HMC) algorithm, which is superior to other Markov chain Monte Carlo methods in sampling volatility variables. We perform HMC simulations of the SV model for two liquid stock returns traded on the Tokyo Stock Exchange and measure the volatilities of those stock returns. We then calculate the accuracy of the volatility measurement, using the realized volatility as a proxy for the true volatility, and compare the SV model with the GARCH model, another volatility model. Using the accuracy calculated with the realized volatility, we find that empirically the SV model performs better than the GARCH model.
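The HMC step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the target here is a one-dimensional standard normal rather than the SV posterior, and the function name `hmc_sample`, the step size, and the number of leapfrog steps are illustrative assumptions.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step=0.2, n_leapfrog=20, seed=0):
    """Draw samples from a 1-D density with hybrid (Hamiltonian) Monte Carlo.

    log_prob and grad_log_prob are the log density (up to a constant)
    and its derivative. All tuning values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Refresh the auxiliary momentum from a standard normal.
        p = rng.gauss(0.0, 1.0)
        x_new, p_new = x, p

        # Leapfrog integration of Hamilton's equations.
        p_new += 0.5 * step * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_prob(x_new)
        p_new += 0.5 * step * grad_log_prob(x_new)

        # Metropolis test on the change in H = -log_prob + p^2 / 2.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Toy target (an assumption, not the SV model): standard normal.
draws = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x,
                   x0=0.0, n_samples=5000)
```

For an SV model the state would be the whole vector of latent volatilities, and the gradient would come from the model's joint log posterior; the structure of the update (momentum refresh, leapfrog trajectory, accept/reject) stays the same.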
ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information.

Science.gov (United States)

Lachmann, Alexander; Giorgi, Federico M; Lopez, Gonzalo; Califano, Andrea

2016-07-15

The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space. Here, we present a completely new implementation of the algorithm, based on an Adaptive Partitioning strategy (AP) for estimating the Mutual Information. The new AP implementation (ARACNe-AP) achieves a dramatic improvement in computational performance (200× on average) over the previous methodology, while preserving the Mutual Information estimator and the Network inference accuracy of the original algorithm. Given that the previous version of ARACNe is extremely demanding, the new version of the algorithm will allow even researchers with modest computational resources to build complex regulatory networks from hundreds of gene expression profiles. A JAVA cross-platform command line executable of ARACNe, together with all source code and a detailed usage guide, is freely available on Sourceforge (http://sourceforge.net/projects/aracne-ap). JAVA version 8 or higher is required. Contact: califano@c2b2.columbia.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
EI: A Program for Ecological Inference

Directory of Open Access Journals (Sweden)

Gary King

2004-09-01

The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997). Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). They are also required in numerous areas of major significance in public policy (e.g., for applying the Voting Rights Act) and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.
A Multifactorial, Criteria-based Progressive Algorithm for Hamstring Injury Treatment.

Science.gov (United States)

Mendiguchia, Jurdan; Martinez-Ruiz, Enrique; Edouard, Pascal; Morin, Jean-Benoît; Martinez-Martinez, Francisco; Idoate, Fernando; Mendez-Villanueva, Alberto

2017-07-01

Given the prevalence of hamstring injuries in football, a rehabilitation program that effectively promotes muscle tissue repair and functional recovery is paramount to minimize reinjury risk and optimize player performance and availability. This study aimed to assess the effectiveness of an individualized and multifactorial criteria-based algorithm (rehabilitation algorithm [RA]) for hamstring injury rehabilitation in comparison with a general rehabilitation protocol (RP). In a double-blind randomized controlled trial, two equal groups of 24 football players (48 total) completed either the RA or a validated RP, starting 5 d after an acute hamstring injury. Within 6 months after return to sport, six hamstring reinjuries occurred in RP versus one injury in RA (relative risk = 6, 90% confidence interval = 1-35; clinical inference: very likely beneficial effect). The average duration of return to sport was possibly quicker (effect size = 0.34 ± 0.42) in RP (23.2 ± 11.7 d) compared with RA (25.5 ± 7.8 d) (-13.8%, 90% confidence interval = -34.0% to 3.4%; clinical inference: possibly small effect). At the time of return to sport, RA players showed substantially better 10-m time, maximal sprinting speed, and greater mechanical variables related to speed (i.e., maximum theoretical speed and maximal horizontal power) than RP players. Although return to sport was slower, male football players who underwent an individualized, multifactorial, criteria-based algorithm with a performance- and primary risk factor-oriented training program from the early stages of the process markedly decreased the risk of reinjury compared with a general protocol where long-length strength training exercises were prioritized.
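The reported relative risk follows directly from the counts in the abstract (six reinjuries among 24 RP players versus one among 24 RA players). The sketch below reproduces that arithmetic. The interval function uses the standard log-transform approximation, which is an assumption: the abstract does not state how its 90% interval was computed, so the result is expected to fall in the same range without matching it exactly. The function names are illustrative.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / n_a) / (events_b / n_b)


def log_rr_interval(events_a, n_a, events_b, n_b, z=1.645):
    """Approximate confidence interval for the relative risk.

    Uses the usual standard error of log(RR); z=1.645 gives a 90%
    interval. Assumed method, not necessarily the one used in the study.
    """
    rr = relative_risk(events_a, n_a, events_b, n_b)
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    return rr * math.exp(-z * se), rr * math.exp(z * se)


# Counts taken from the abstract: RP = 6 of 24, RA = 1 of 24.
rr = relative_risk(6, 24, 1, 24)
low, high = log_rr_interval(6, 24, 1, 24)
```

With only one event in the RA group the interval is necessarily wide, which is consistent with the broad range given in the abstract.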
A Review of Intelligent Driving Style Analysis Systems and Related Artificial Intelligence Algorithms.

Science.gov (United States)

Meiring, Gys Albertus Marthinus; Myburgh, Hermanus Carel

2015-12-04

In this paper the various driving style analysis solutions are investigated. An in-depth investigation is performed to identify the relevant machine learning and artificial intelligence algorithms utilised in current driver behaviour and driving style analysis systems. This review therefore serves as a trove of information, and will inform the specialist and the student regarding the current state of the art in driver style analysis systems, the application of these systems and the underlying artificial intelligence algorithms applied to these applications. The aim of the investigation is to evaluate the possibilities for unique driver identification utilizing the approaches identified in other driver behaviour studies. It was found that Fuzzy Logic inference systems, Hidden Markov Models and Support Vector Machines offer promising capabilities for unique driver identification, provided that model complexity can be reduced.
An Algorithm to Automate Yeast Segmentation and Tracking

Science.gov (United States)

Doncic, Andreas; Eser, Umut; Atay, Oguzhan; Skotheim, Jan M.

2013-01-01

Our understanding of dynamic cellular processes has been greatly enhanced by rapid advances in quantitative fluorescence microscopy. Imaging single cells has emphasized the prevalence of phenomena that can be difficult to infer from population measurements, such as all-or-none cellular decisions, cell-to-cell variability, and oscillations. Examination of these phenomena requires segmenting and tracking individual cells over long periods of time. However, accurate segmentation and tracking of cells is difficult and is often the rate-limiting step in an experimental pipeline. Here, we present an algorithm that accomplishes fully automated segmentation and tracking of budding yeast cells within growing colonies. The algorithm incorporates prior information of yeast-specific traits, such as immobility and growth rate, to segment an image using a set of threshold values rather than one specific optimized threshold. Results from the entire set of thresholds are then used to perform a robust final segmentation. PMID:23520484
On the criticality of inferred models

Science.gov (United States)

Mastromatteo, Iacopo; Marsili, Matteo

2011-10-01

Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
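The link between the Fisher information and the susceptibility stated in this abstract can be made concrete under one assumption that the abstract itself does not spell out: that the inferred model is an exponential family with couplings $g_\mu$ conjugate to observables $\phi_\mu(s)$. The symbols below are introduced only for this illustration.

```latex
p(s \mid g) = \frac{1}{Z(g)} \exp\Big( \sum_\mu g_\mu \phi_\mu(s) \Big),
\qquad
\frac{\partial \log Z}{\partial g_\mu} = \langle \phi_\mu \rangle ,
\\[6pt]
\chi_{\mu\nu}
  \equiv \frac{\partial \langle \phi_\mu \rangle}{\partial g_\nu}
  = \langle \phi_\mu \phi_\nu \rangle
    - \langle \phi_\mu \rangle \langle \phi_\nu \rangle ,
\\[6pt]
J_{\mu\nu}
  \equiv - \Big\langle
      \frac{\partial^2 \log p(s \mid g)}{\partial g_\mu \, \partial g_\nu}
    \Big\rangle
  = \frac{\partial^2 \log Z}{\partial g_\mu \, \partial g_\nu}
  = \chi_{\mu\nu} .
```

Under that assumption the Fisher information matrix $J$ equals the susceptibility matrix $\chi$, so regions of parameter space where the susceptibility is large are also regions where nearby parameter values are easiest to tell apart from data.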
On the criticality of inferred models

International Nuclear Information System (INIS)

Mastromatteo, Iacopo; Marsili, Matteo

2011-01-01

Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality

EIT image regularization by a new Multi-Objective Simulated Annealing algorithm.

Science.gov (United States)

Castro Martins, Thiago; Sales Guerra Tsuzuki, Marcos

2015-01-01

Multi-Objective Optimization can be used to produce regularized Electrical Impedance Tomography (EIT) images where the weight of the regularization term is not known a priori. This paper proposes a novel Multi-Objective Optimization algorithm based on Simulated Annealing tailored for EIT image reconstruction. Images are reconstructed from experimental data and compared with images from other Multi and Single Objective optimization methods. A significant performance enhancement from traditional techniques can be inferred from the results.
Inference

DEFF Research Database (Denmark)

Møller, Jesper

2010-01-01

Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo (MCMC) techniques. Due to space limitations the focus is on spatial point processes.
Feature Inference Learning and Eyetracking

Science.gov (United States)

Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.

2009-01-01

Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
Planning the FUSE Mission Using the SOVA Algorithm

Science.gov (United States)

Lanzi, James; Heatwole, Scott; Ward, Philip R.; Civeit, Thomas; Calvani, Humberto; Kruk, Jeffrey W.; Suchkov, Anatoly

2011-01-01

Three documents discuss the Sustainable Objective Valuation and Attainability (SOVA) algorithm and software as used to plan tasks (principally, scientific observations and associated maneuvers) for the Far Ultraviolet Spectroscopic Explorer (FUSE) satellite. SOVA is a means of managing risk in a complex system, based on a concept of computing the expected return value of a candidate ordered set of tasks as a product of pre-assigned task values and assessments of attainability made against qualitatively defined strategic objectives. For the FUSE mission, SOVA autonomously assembles a week-long schedule of target observations and associated maneuvers so as to maximize the expected scientific return value while keeping the satellite stable, managing the angular momentum of spacecraft attitude-control reaction wheels, and striving for other strategic objectives. A six-degree-of-freedom model of the spacecraft is used in simulating the tasks, and the attainability of a task is calculated at each step by use of strategic objectives as defined by use of fuzzy inference systems. SOVA utilizes a variant of a graph-search algorithm known as the A* search algorithm to assemble the tasks into a week-long target schedule, using the expected scientific return value to guide the search.
Multimodal FMRI resting-state functional connectivity in granulin mutations: the case of fronto-parietal dementia.

Directory of Open Access Journals (Sweden)

Enrico Premi

BACKGROUND: Monogenic dementias represent a great opportunity to trace disease progression from preclinical to symptomatic stages. Frontotemporal Dementia related to Granulin (GRN) mutations presents a specific framework of brain damage, involving fronto-temporal regions and long inter-hemispheric white matter bundles. Multimodal resting-state functional MRI (rs-fMRI) is a promising tool to carefully describe disease signature from the earliest disease phase. OBJECTIVE: To define local connectivity alterations in GRN-related pathology moving from the presymptomatic phase (asymptomatic GRN mutation carriers) to the clinical phase of the disease (GRN-related Frontotemporal Dementia). METHODS: Thirty-one GRN Thr272fs mutation carriers (14 patients with Frontotemporal Dementia and 17 asymptomatic carriers) and 38 healthy controls were recruited. Local connectivity measures (Regional Homogeneity (ReHo), Fractional Amplitude of Low Frequency Fluctuation (fALFF) and Degree Centrality (DC)) were computed, considering age and gender as nuisance variables as well as the influence of voxel-level gray matter atrophy. RESULTS: Asymptomatic GRN carriers had selectively reduced ReHo in the left parietal region and increased ReHo in frontal regions compared to healthy controls. In Frontotemporal Dementia patients, all measures (ReHo, fALFF and DC) were reduced in inferior parietal, frontal lobes and posterior cingulate cortex. In GRN mutation carriers, an inverse correlation with age in the posterior cingulate cortex, inferior parietal lobule and orbitofrontal cortex was found. CONCLUSIONS: GRN pathology is characterized by functional brain network alterations even decades before the clinical onset; they involve the parietal region primarily and then spread to the anterior regions of the brain, supporting the concept of molecular nexopathies.
Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana

KAUST Repository

Muraro, D.

2013-01-01

Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation. © 2004-2012 IEEE.
Sensitivity to neurotoxic stress is not increased in progranulin-deficient mice.

Science.gov (United States)

Petkau, Terri L; Zhu, Shanshan; Lu, Ge; Fernando, Sarah; Cynader, Max; Leavitt, Blair R

2013-11-01

Loss-of-function mutations in the progranulin (GRN) gene are a common cause of autosomal dominant frontotemporal lobar degeneration, a fatal and progressive neurodegenerative disorder common in people less than 65 years of age. In the brain, progranulin is expressed in multiple regions at varying levels, and has been hypothesized to play a neuroprotective or neurotrophic role. Four neurotoxic agents were injected in vivo into constitutive progranulin knockout (Grn(-/-)) mice and their wild-type (Grn(+/+)) counterparts to assess neuronal sensitivity to toxic stress. Administration of 3-nitropropionic acid, quinolinic acid, kainic acid, and pilocarpine induced robust and measurable neuronal cell death in affected brain regions, but no differential cell death was observed between Grn(+/+) and Grn(-/-) mice. Thus, constitutive progranulin knockout mice do not have increased sensitivity to neuronal cell death induced by the acute chemical models of neuronal injury used in this study. Copyright © 2013. Published by Elsevier Inc.
Forward and backward inference in spatial cognition.

Directory of Open Access Journals (Sweden)

Will D Penny

This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
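The first use of forward inference described above, combining a motion-based prediction with sensory evidence to estimate location, can be sketched as a discrete forward filter. This is a generic hidden-Markov-model recursion offered as an illustration, not the model from the paper; the three-location track, the transition matrix, and the likelihood values are all made-up assumptions.

```python
def forward_filter(prior, transition, likelihoods):
    """Forward inference over discrete locations.

    prior[i]          : P(location = i) before any observation
    transition[i][j]  : P(next location = j | current location = i),
                        standing in for path integration
    likelihoods[t][j] : P(sensory input at step t | location = j)
    Returns the normalized belief after each step.
    """
    belief = list(prior)
    n = len(belief)
    history = []
    for lik in likelihoods:
        # Predict: propagate the belief through the motion model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the sensory likelihood and renormalize.
        unnorm = [predicted[j] * lik[j] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Hypothetical track with three locations; the agent tends to move right.
prior = [1.0, 0.0, 0.0]
transition = [[0.2, 0.8, 0.0],
              [0.0, 0.2, 0.8],
              [0.0, 0.0, 1.0]]
likelihoods = [[0.1, 0.8, 0.1],   # sensory input favours location 1
               [0.1, 0.1, 0.8]]   # sensory input favours location 2
beliefs = forward_filter(prior, transition, likelihoods)
```

The normalizing constant `total` at each step is the likelihood of the observation given the past, so accumulating its logarithm across steps would give the quantity needed to compare environments, the third use of forward inference mentioned in the abstract.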
Inferring Group Processes from Computer-Mediated Affective Text Analysis

Energy Technology Data Exchange (ETDEWEB)

Schryver, Jack C [ORNL; Begoli, Edmon [ORNL; Jose, Ajith [Missouri University of Science and Technology; Griffin, Christopher [Pennsylvania State University

2011-02-01

Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which is just positive or negative, we explore nuances of affective meaning in 22 affect categories. Our affect propagation algorithm automatically calculates and displays extracted affective relationships among entities in graphical form in our prototype (TEAMSTER), starting with seed lists of affect terms. Several useful metrics are defined to infer underlying group processes by aggregating affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.
Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors

International Nuclear Information System (INIS)

Lucka, Felix

2012-01-01

Sparsity has become a key concept for solving high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity-promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample-based Bayesian inference.
Human disease MiRNA inference by combining target information based on heterogeneous manifolds.

Science.gov (United States)

Ding, Pingjian; Luo, Jiawei; Liang, Cheng; Xiao, Qiu; Cao, Buwen

2018-04-01

The emergence of network medicine has provided great insight into the identification of disease-related molecules, which could help with the development of personalized medicine. However, the state-of-the-art methods could neither simultaneously consider target information and the known miRNA-disease associations nor effectively explore novel gene-disease associations as a by-product during the process of inferring disease-related miRNAs. Computational methods incorporating multiple sources of information offer more opportunities to infer disease-related molecules, including miRNAs and genes in heterogeneous networks at a system level. In this study, we developed a novel algorithm, named inference of Disease-related MiRNAs based on Heterogeneous Manifold (DMHM), to accurately and efficiently identify miRNA-disease associations by integrating multi-omics data. Graph-based regularization was utilized to obtain a smooth function on the data manifold, which constitutes the main principle of DMHM. The novelty of this framework lies in the relatedness between diseases and miRNAs, which are measured via heterogeneous manifolds on heterogeneous networks integrating target information. To demonstrate the effectiveness of DMHM, we conducted comprehensive experiments based on HMDD datasets and compared DMHM with six state-of-the-art methods. Experimental results indicated that DMHM significantly outperformed the other six methods under fivefold cross validation and de novo prediction tests. Case studies have further confirmed the practical usefulness of DMHM. Copyright © 2018 Elsevier Inc. All rights reserved.
Exact Algorithms for Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.

Science.gov (United States)

Kordi, Misagh; Bansal, Mukul S

2017-06-01

Duplication-Transfer-Loss (DTL) reconciliation is a powerful method for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation seeks to reconcile gene trees with species trees by postulating speciation, duplication, transfer, and loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. In practice, however, gene trees are often non-binary due to uncertainty in the gene tree topologies, and DTL reconciliation with non-binary gene trees is known to be NP-hard. In this paper, we present the first exact algorithms for DTL reconciliation with non-binary gene trees. Specifically, we (i) show that the DTL reconciliation problem for non-binary gene trees is fixed-parameter tractable in the maximum degree of the gene tree, (ii) present an exponential-time, but in-practice efficient, algorithm to track and enumerate all optimal binary resolutions of a non-binary input gene tree, and (iii) apply our algorithms to a large empirical data set of over 4700 gene trees from 100 species to study the impact of gene tree uncertainty on DTL-reconciliation and to demonstrate the applicability and utility of our algorithms. The new techniques and algorithms introduced in this paper will help biologists avoid incorrect evolutionary inferences caused by gene tree uncertainty.
A formal model of interpersonal inference

Directory of Open Access Journals (Sweden)

Michael Moutoussis

2014-03-01

Introduction: We propose that active Bayesian inference, a general framework for decision-making, can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regard to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases) can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially this includes inference about one's own qualities and preferences. Conclusion: We inaugurate a Bayes optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.
Expression of the growth factor progranulin in endothelial cells influences growth and development of blood vessels: a novel mouse model.

Science.gov (United States)

Toh, Huishi; Cao, Mingju; Daniels, Eugene; Bateman, Andrew

2013-01-01

Progranulin is a secreted glycoprotein that regulates cell proliferation, migration and survival. It has roles in development, tumorigenesis, wound healing, neurodegeneration and inflammation. Endothelia in tumors, wounds and placenta express elevated levels of progranulin. In culture, progranulin activates endothelial proliferation and migration. This suggested that progranulin might regulate angiogenesis. It was, however, unclear how elevated endothelial progranulin levels influence vascular growth in vivo. To address this issue, we generated mice with progranulin expression targeted specifically to developing endothelial cells using a Tie2-promoter/enhancer construct. Three Tie2-Grn mouse lines were generated with varying Tie2-Grn copy number, and were called GrnLo, GrnMid, and GrnHi. All three lines showed increased mortality that correlates with Tie2-Grn copy number, with greatest mortality and lowest germline transmission in the GrnHi line. Death of the transgenic animals occurred around birth, and continued for three days after birth. Those that survived beyond day 3 survived into adulthood. Transgenic neonates that died showed vascular abnormalities of varying severity. Some exhibited bleeding into body cavities such as the pericardial space. Smaller localized hemorrhages were seen in many organs. Blood vessels were often dilated and thin-walled. To establish the development of these abnormalities, we examined mice at early (E10.5-14.5) and later (E15.5-17.5) developmental phases. Early events during vasculogenesis appear unaffected by Tie2-Grn as apparently normal primary vasculature had been established at E10.5. The earliest onset of vascular abnormality was at E15.5, with focal cerebral hemorrhage and enlarged vessels in various organs. Aberrant Tie2-Grn positive vessels showed thinning of the basement membrane and reduced investiture with mural cells. We conclude that progranulin promotes exaggerated vessel growth in vivo, with subsequent effects in
Expression of the growth factor progranulin in endothelial cells influences growth and development of blood vessels: a novel mouse model.

Directory of Open Access Journals (Sweden)

Huishi Toh

Full Text Available Progranulin is a secreted glycoprotein that regulates cell proliferation, migration and survival. It has roles in development, tumorigenesis, wound healing, neurodegeneration and inflammation. Endothelia in tumors, wounds and placenta express elevated levels of progranulin. In culture, progranulin activates endothelial proliferation and migration. This suggested that progranulin might regulate angiogenesis. It was, however, unclear how elevated endothelial progranulin levels influence vascular growth in vivo. To address this issue, we generated mice with progranulin expression targeted specifically to developing endothelial cells using a Tie2-promoter/enhancer construct. Three Tie2-Grn mouse lines were generated with varying Tie2-Grn copy number, and were called GrnLo, GrnMid, and GrnHi. All three lines showed increased mortality that correlates with Tie2-Grn copy number, with greatest mortality and lowest germline transmission in the GrnHi line. Death of the transgenic animals occurred around birth, and continued for three days after birth. Those that survived beyond day 3 survived into adulthood. Transgenic neonates that died showed vascular abnormalities of varying severity. Some exhibited bleeding into body cavities such as the pericardial space. Smaller localized hemorrhages were seen in many organs. Blood vessels were often dilated and thin-walled. To establish the development of these abnormalities, we examined mice at early (E10.5-14.5) and later (E15.5-17.5) developmental phases. Early events during vasculogenesis appear unaffected by Tie2-Grn as apparently normal primary vasculature had been established at E10.5. The earliest onset of vascular abnormality was at E15.5, with focal cerebral hemorrhage and enlarged vessels in various organs. Aberrant Tie2-Grn positive vessels showed thinning of the basement membrane and reduced investiture with mural cells. We conclude that progranulin promotes exaggerated vessel growth in vivo, with
Distributional Inference

NARCIS (Netherlands)

Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.

1995-01-01

The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is
A Review of Intelligent Driving Style Analysis Systems and Related Artificial Intelligence Algorithms

Directory of Open Access Journals (Sweden)

Gys Albertus Marthinus Meiring

2015-12-01

Full Text Available In this paper the various driving style analysis solutions are investigated. An in-depth investigation is performed to identify the relevant machine learning and artificial intelligence algorithms utilised in current driver behaviour and driving style analysis systems. This review therefore serves as a trove of information, and will inform the specialist and the student regarding the current state of the art in driver style analysis systems, the application of these systems and the underlying artificial intelligence algorithms applied to these applications. The aim of the investigation is to evaluate the possibilities for unique driver identification utilizing the approaches identified in other driver behaviour studies. It was found that Fuzzy Logic inference systems, Hidden Markov Models and Support Vector Machines offer promising capabilities for unique driver identification if model complexity can be reduced.
Intelligent Diagnostic Assistant for Complicated Skin Diseases through C5's Algorithm.

Science.gov (United States)

Jeddi, Fatemeh Rangraz; Arabfard, Masoud; Kermany, Zahra Arab

2017-09-01

An intelligent diagnostic assistant can be used for complicated diagnosis of skin diseases, which are among the most common causes of disability. The aim of this study was to design and implement a computerized intelligent diagnostic assistant for complicated skin diseases through C5's Algorithm. An applied-developmental study was done in 2015. A knowledge base was developed based on interviews with dermatologists through questionnaires and checklists. Knowledge representation was obtained from the training data in the database using Microsoft Excel. Clementine Software and C5's Algorithms were applied to draw the decision tree. Analysis of test accuracy was performed based on rules extracted using inference chains. The rules extracted from the decision tree were entered into the CLIPS programming environment, and the intelligent diagnostic assistant was then designed. The rules were defined using the forward chaining inference technique and were entered into the CLIPS programming environment as RULE. The accuracy and error rates obtained in the training phase from the decision tree were 99.56% and 0.44%, respectively. The accuracy of the decision tree was 98% and the error was 2% in the test phase. The intelligent diagnostic assistant can be used as a reliable system with high accuracy, sensitivity, specificity, and agreement.
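The forward chaining step mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python rather than CLIPS, and the fact and rule names are hypothetical placeholders, not rules from the study's knowledge base. Rules fire whenever all of their premises are already known, and the loop repeats until nothing new is derived.

```python
def forward_chain(facts, rules):
    """Derive every fact reachable from `facts` by repeatedly firing rules.

    facts: iterable of strings taken as true.
    rules: list of (premises, conclusion) pairs, where premises is a set.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all its premises are known and it adds something new.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy rules (assumed for illustration only).
RULES = [
    ({"finding_a", "finding_b"}, "intermediate_1"),
    ({"intermediate_1", "finding_c"}, "diagnosis_x"),
]

derived = forward_chain({"finding_a", "finding_b", "finding_c"}, RULES)
```

With all three findings present, the second rule can fire only after the first has added its conclusion, which is the chaining behaviour the abstract refers to.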
Continuous Integrated Invariant Inference, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...
A practical exact maximum compatibility algorithm for reconstruction of recent evolutionary history.

Science.gov (United States)

Cherry, Joshua L

2017-02-23

Maximum compatibility is a method of phylogenetic reconstruction that is seldom applied to molecular sequences. It may be ideal for certain applications, such as reconstructing phylogenies of closely-related bacteria on the basis of whole-genome sequencing. Here I present an algorithm that rapidly computes phylogenies according to a compatibility criterion. Although based on solutions to the maximum clique problem, this algorithm deals properly with ambiguities in the data. The algorithm is applied to bacterial data sets containing up to nearly 2000 genomes with several thousand variable nucleotide sites. Run times are several seconds or less. Computational experiments show that maximum compatibility is less sensitive than maximum parsimony to the inclusion of nucleotide data that, though derived from actual sequence reads, has been identified as likely to be misleading. Maximum compatibility is a useful tool for certain phylogenetic problems, such as inferring the relationships among closely-related bacteria from whole-genome sequence data. The algorithm presented here rapidly solves fairly large problems of this type, and provides robustness against misleading characters that can pollute large-scale sequencing data.

Estimating uncertainty of inference for validation

Energy Technology Data Exchange (ETDEWEB)

Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM

2010-09-30

We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10^13-10^14 neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the
Development of algorithms for building inventory compilation through remote sensing and statistical inferencing

Science.gov (United States)

Sarabandi, Pooya

economical way. A terrain-dependent-search algorithm is formulated to facilitate the search for correspondences in a quasi-stereo pair of images. The calculated heights for sample buildings using the cross-sensor data fusion algorithm show an average coefficient of variation of 1.03%. In order to infer structural-type and occupancy-type, i.e. engineering attributes, of buildings from spatial and geometric attributes of 3-D models, a statistical data analysis framework is formulated. Applications of "Classification Trees" and "Multinomial Logistic Models" in modeling the marginal probabilities of class-membership of engineering attributes are investigated. Adaptive statistical models that incorporate different spatial and geometric attributes of buildings while inferring the engineering attributes are developed in this dissertation. The inferred engineering attributes in conjunction with the spatial and geometric attributes derived from the imagery can be used to augment regional building inventories and therefore enhance the result of catastrophe models. In the last part of the dissertation, a set of empirically-derived motion-damage relationships based on the correlation of observed building performance with measured ground-motion parameters from the 1994 Northridge and 1999 Chi-Chi Taiwan earthquakes are developed. Fragility functions in the form of cumulative lognormal distributions and damage probability matrices for several classes of buildings (wood, steel and concrete), as well as a number of ground-motion intensity measures, are developed and compared to currently-used motion-damage relationships.
Space-Time Joint Interference Cancellation Using Fuzzy-Inference-Based Adaptive Filtering Techniques in Frequency-Selective Multipath Channels

Science.gov (United States)

Hu, Chia-Chang; Lin, Hsuan-Yu; Chen, Yu-Fan; Wen, Jyh-Horng

2006-12-01

An adaptive minimum mean-square error (MMSE) array receiver based on the fuzzy-logic recursive least-squares (RLS) algorithm is developed for asynchronous DS-CDMA interference suppression in the presence of frequency-selective multipath fading. This receiver employs a fuzzy-logic control mechanism to perform the nonlinear mapping of the squared error and the squared error variation into a forgetting factor. For real-time applicability, a computationally efficient version of the proposed receiver is derived based on the least-mean-square (LMS) algorithm using a fuzzy-inference-controlled step size. This receiver provides both fast convergence and tracking capability and small steady-state misadjustment compared with conventional LMS- and RLS-based MMSE DS-CDMA receivers. Simulations show that the fuzzy-logic LMS and RLS algorithms outperform other variable step-size LMS (VSS-LMS) and variable forgetting factor RLS (VFF-RLS) algorithms, respectively, by at least 3 dB and 1.5 dB in bit-error-rate (BER) for multipath fading channels.
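As a rough illustration of an LMS filter with an error-dependent step size, the sketch below identifies a short FIR channel. It does not reproduce the receiver's fuzzy-inference controller: the step-size rule (step grows with the squared error and is clamped) is an assumed stand-in, and the channel taps, gain and bounds are hypothetical values chosen for the demonstration.

```python
import random


def lms_identify(x, d, n_taps, mu_min=0.01, mu_max=0.1, gain=0.05):
    """Estimate FIR weights w so that the filtered input x tracks d."""
    w = [0.0] * n_taps
    for n in range(n_taps - 1, len(x)):
        window = [x[n - k] for k in range(n_taps)]  # newest sample first
        y = sum(wk * xk for wk, xk in zip(w, window))  # filter output
        e = d[n] - y  # instantaneous error
        # Assumed step-size rule (not the paper's fuzzy controller):
        # larger squared error gives a larger step, clamped at mu_max.
        mu = min(mu_max, mu_min + gain * e * e)
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]  # LMS update
    return w


rng = random.Random(0)
true_h = [0.5, -0.3, 0.1]  # hypothetical channel to identify
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(true_h[k] * x[n - k] for k in range(3) if n - k >= 0)
     for n in range(len(x))]
w = lms_identify(x, d, n_taps=3)
```

The design intent mirrors the abstract's motivation: a large step while the error is large speeds up convergence, and a small step near convergence limits steady-state misadjustment.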
An algorithm to automate yeast segmentation and tracking.

Directory of Open Access Journals (Sweden)

Andreas Doncic

Full Text Available Our understanding of dynamic cellular processes has been greatly enhanced by rapid advances in quantitative fluorescence microscopy. Imaging single cells has emphasized the prevalence of phenomena that can be difficult to infer from population measurements, such as all-or-none cellular decisions, cell-to-cell variability, and oscillations. Examination of these phenomena requires segmenting and tracking individual cells over long periods of time. However, accurate segmentation and tracking of cells is difficult and is often the rate-limiting step in an experimental pipeline. Here, we present an algorithm that accomplishes fully automated segmentation and tracking of budding yeast cells within growing colonies. The algorithm incorporates prior information of yeast-specific traits, such as immobility and growth rate, to segment an image using a set of threshold values rather than one specific optimized threshold. Results from the entire set of thresholds are then used to perform a robust final segmentation.
Research on Inferring ELECTRE-III’s Parameters with Fuzzy information and A Case on Naval Gun Weapon System Integration

Directory of Open Access Journals (Sweden)

Sun Shi Yan

2016-01-01

Full Text Available The multiple attribute decision making (MADM) method is an important tool for system integration. Robustness analysis of MADM has been a hotspot in recent years, has attracted great academic attention, and is considered an effective way to counter imperfect information. Setting the parameters of ELECTRE-III is a vital and difficult step. In this paper, a method of inferring ELECTRE-III’s parameters with fuzzy information based on robustness analysis is presented. First, ELECTRE-III is transformed into a continuous smooth function of each parameter vector. Then, a robustness analysis structure and a parameter inference algorithm are provided by maximizing the robustness margin based on mathematical programming. Moreover, how to solve the programming problem is also discussed. Finally, an illustrative example of Naval Gun Weapon System Integration is put forward.
An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems

KAUST Repository

Zenil, Hector

2017-09-08

We introduce a conceptual framework and an interventional calculus to steer and manipulate systems based on their intrinsic algorithmic probability using the universal principles of the theory of computability and algorithmic information. By applying sequences of controlled interventions to systems and networks, we estimate how changes in their algorithmic information content are reflected in positive/negative shifts towards and away from randomness. The strong connection between approximations to algorithmic complexity (the size of the shortest generating mechanism) and causality induces a sequence of perturbations ranking the network elements by the steering capabilities that each of them is capable of. This new dimension unmasks a separation between causal and non-causal components providing a suite of powerful parameter-free algorithms of wide applicability ranging from optimal dimension reduction, maximal randomness analysis and system control. We introduce methods for reprogramming systems that do not require the full knowledge or access to the system's actual kinetic equations or any probability distributions. A causal interventional analysis of synthetic and regulatory biological networks reveals how the algorithmic reprogramming qualitatively reshapes the system's dynamic landscape. For example, during cellular differentiation we find a decrease in the number of elements corresponding to a transition away from randomness and a combination of the system's intrinsic properties and its intrinsic capabilities to be algorithmically reprogrammed can reconstruct an epigenetic landscape. The interventional calculus is broadly applicable to predictive causal inference of systems such as networks and of relevance to a variety of machine and causal learning techniques driving model-based approaches to better understanding and manipulate complex systems.
Application of maximum entropy to statistical inference for inversion of data from a single track segment.

Science.gov (United States)

Stotts, Steven A; Koch, Robert A

2017-08-01

In this paper an approach is presented to estimate the constraint required to apply maximum entropy (ME) for statistical inference with underwater acoustic data from a single track segment. Previous algorithms for estimating the ME constraint require multiple source track segments to determine the constraint. The approach is relevant for addressing model mismatch effects, i.e., inaccuracies in parameter values determined from inversions because the propagation model does not account for all acoustic processes that contribute to the measured data. One effect of model mismatch is that the lowest cost inversion solution may be well outside a relatively well-known parameter value's uncertainty interval (prior), e.g., source speed from track reconstruction or towed source levels. The approach requires, for some particular parameter value, the ME constraint to produce an inferred uncertainty interval that encompasses the prior. Motivating this approach is the hypothesis that the proposed constraint determination procedure would produce a posterior probability density that accounts for the effect of model mismatch on inferred values of other inversion parameters for which the priors might be quite broad. Applications to both measured and simulated data are presented for model mismatch that produces minimum cost solutions either inside or outside some priors.
Exploiting visual search theory to infer social interactions

Science.gov (United States)

Rota, Paolo; Dang-Nguyen, Duc-Tien; Conci, Nicola; Sebe, Nicu

2013-03-01

In this paper we propose a new method to infer human social interactions using typical techniques adopted in the literature for visual search and information retrieval. The main piece of information we use to discriminate among different types of interactions is provided by proxemics cues acquired by a tracker, and used to distinguish between intentional and casual interactions. The proxemics information has been acquired through the analysis of two different metrics: on the one hand we observe the current distance between subjects, and on the other hand we measure the O-space synergy between subjects. The obtained values are taken at every time step over a temporal sliding window, and processed in the Discrete Fourier Transform (DFT) domain. The features are eventually merged into a unique array, and clustered using the K-means algorithm. The clusters are reorganized using a second larger temporal window into a Bag Of Words framework, so as to build the feature vector that will feed the SVM classifier.
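The sliding-window DFT step described in this abstract can be sketched as follows. This is only an illustration of that single stage, under the assumption that the input is a one-dimensional series such as an inter-subject distance per time step; the window width, step and function names are hypothetical, and the later K-means, Bag Of Words and SVM stages are not shown.

```python
import cmath


def dft_magnitudes(window):
    """Magnitude of each DFT coefficient of a real-valued window."""
    n = len(window)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(window)))
        for k in range(n)
    ]


def sliding_dft_features(signal, width, step=1):
    """One DFT-magnitude feature vector per window position."""
    return [
        dft_magnitudes(signal[start:start + width])
        for start in range(0, len(signal) - width + 1, step)
    ]


# Hypothetical distance series: constant separation between two subjects.
features = sliding_dft_features([2.0] * 6, width=4)
```

For a constant series all the energy sits in the zero-frequency coefficient, so each feature vector has a single non-zero entry; fluctuating distances would spread energy into the higher coefficients.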
TALE factors use two distinct functional modes to control an essential zebrafish gene expression program.

Science.gov (United States)

Ladam, Franck; Stanney, William; Donaldson, Ian J; Yildiz, Ozge; Bobola, Nicoletta; Sagerström, Charles G

2018-06-18

TALE factors are broadly expressed embryonically and known to function in complexes with transcription factors (TFs) like Hox proteins at gastrula/segmentation stages, but it is unclear if such generally expressed factors act by the same mechanism throughout embryogenesis. We identify a TALE-dependent gene regulatory network (GRN) required for anterior development and detect TALE occupancy associated with this GRN throughout embryogenesis. At blastula stages, we uncover a novel functional mode for TALE factors, where they occupy genomic DECA motifs with nearby NF-Y sites. We demonstrate that TALE and NF-Y form complexes and regulate chromatin state at genes of this GRN. At segmentation stages, GRN-associated TALE occupancy expands to include HEXA motifs near PBX:HOX sites. Hence, TALE factors control a key GRN, but utilize distinct DNA motifs and protein partners at different stages - a strategy that may also explain their oncogenic potential and may be employed by other broadly expressed TFs. © 2018, Ladam et al.
Applications of machine-learning algorithms for infrared colour selection of Galactic Wolf-Rayet stars

Science.gov (United States)

Morello, Giuseppe; Morris, P. W.; Van Dyk, S. D.; Marston, A. P.; Mauerhan, J. C.

2018-01-01

We have investigated and applied machine-learning algorithms for infrared colour selection of Galactic Wolf-Rayet (WR) candidates. Objects taken from the Spitzer Galactic Legacy Infrared Midplane Survey Extraordinaire (GLIMPSE) catalogue of the infrared objects in the Galactic plane can be classified into different stellar populations based on the colours inferred from their broad-band photometric magnitudes [J, H and Ks from 2 Micron All Sky Survey (2MASS), and the four Spitzer/IRAC bands]. The algorithms tested in this pilot study are variants of the k-nearest neighbours approach, which is ideal for exploratory studies of classification problems where interrelations between variables and classes are complicated. The aims of this study are (1) to provide an automated tool to select reliable WR candidates and potentially other classes of objects, (2) to measure the efficiency of infrared colour selection at performing these tasks and (3) to lay the groundwork for statistically inferring the total number of WR stars in our Galaxy. We report the performance results obtained over a set of known objects and selected candidates for which we have carried out follow-up spectroscopic observations, and confirm the discovery of four new WR stars.
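A plain k-nearest neighbours classifier, the family of methods named in this abstract, can be sketched in a few lines. The two-dimensional points and labels below are made-up toy values standing in for infrared colour indices; they are not data from GLIMPSE or 2MASS, and the study's specific k-NN variants are not reproduced.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    train: list of (feature_tuple, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy colours (assumed for illustration only).
TRAIN = [
    ((0.1, 0.2), "other"), ((0.2, 0.1), "other"), ((0.0, 0.0), "other"),
    ((1.0, 1.1), "WR"), ((1.2, 0.9), "WR"), ((0.9, 1.0), "WR"),
]

label = knn_predict(TRAIN, (1.0, 1.0), k=3)
```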
Quantum-Like Representation of Non-Bayesian Inference

Science.gov (United States)

Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.

2013-01-01

This research is related to the problem of "irrational decision making or inference" that has been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like cognitive models of decision making were proposed. Our previous work represented in a natural way the classical Bayesian inference in the framework of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Statistical inference: an integrated Bayesian/likelihood approach

CERN Document Server

Aitkin, Murray

2010-01-01

Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
TMEM106B gene polymorphism is associated with age at onset in granulin mutation carriers and plasma granulin protein levels

Science.gov (United States)

Cruchaga, Carlos; Graff, Caroline; Chiang, Huei-Hsin; Wang, Jun; Hinrichs, Anthony L.; Spiegel, Noah; Bertelsen, Sarah; Mayo, Kevin; Norton, Joanne B.; Morris, John C.; Goate, Alison

2011-01-01

Objective: A recent genome-wide association study for frontotemporal lobar degeneration with TAR DNA-binding protein inclusions (FTLD-TDP) identified rs1990622 (TMEM106B) as a risk factor for FTLD-TDP. In this study we tested whether rs1990622 is associated with age at onset (AAO) in granulin (GRN) mutation carriers and with plasma GRN levels in mutation carriers and healthy elderly individuals. Design: Rs1990622 was genotyped in GRN mutation carriers and tested for association with AAO using the Kaplan-Meier method and a Cox proportional hazards model. Subjects: We analyzed 50 affected and unaffected GRN mutation carriers from four previously reported FTLD-TDP families (HDDD1, FD1, HDDD2 and the Karolinska family). GRN plasma levels were also measured in 73 healthy, elderly individuals. Results: The risk allele of rs1990622 is associated with a mean decrease of the age at onset of thirteen years (p = 9.9×10^-7), with lower plasma granulin levels in both healthy older adults (p = 4×10^-4) and GRN mutation carriers (p = 0.0027). Analysis of the HAPMAP database identified a non-synonymous single nucleotide polymorphism, rs3173615 (T185S), in perfect linkage disequilibrium with rs1990622. Conclusions: The association of rs1990622 with AAO explains, in part, the wide range in the age at onset of disease among GRN mutation carriers. We hypothesize that rs1990622 or another variant in linkage disequilibrium could act in a manner similar to APOE in Alzheimer’s disease, increasing risk for disease in the general population and modifying AAO in mutation carriers. Our results also suggest that genetic variation in TMEM106B may influence risk for FTLD-TDP by modulating secreted levels of GRN. PMID:21220649
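The Kaplan-Meier estimator named in the Design section can be sketched as below. The times and event flags are made-up toy values, not the study's data; here an event flag of 1 would stand for observed disease onset and 0 for a carrier still unaffected at last follow-up (censored). At each distinct event time the running survival estimate is multiplied by one minus the fraction of at-risk subjects who had the event.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  observation time per subject.
    events: 1 if the event was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Hypothetical toy data (assumed for illustration only).
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```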
Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions

Directory of Open Access Journals (Sweden)

Balding David J

2008-12-01

Full Text Available Abstract Background: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference, is well-established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences. Results: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate. Conclusion: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.
Reconstruction of elongated bubbles fusing the information from multiple optical probes through a Bayesian inference technique

Energy Technology Data Exchange (ETDEWEB)

Chakraborty, Shubhankar; Das, Prasanta Kr., E-mail: pkd@mech.iitkgp.ernet.in [Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India); Roy Chaudhuri, Partha [Department of Physics, Indian Institute of Technology Kharagpur, Kharagpur 721302 (India)

2016-07-15

In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Information-Theoretic Inference of Common Ancestors

Directory of Open Access Journals (Sweden)

Bastian Steudel

2015-04-01

Full Text Available A directed acyclic graph (DAG) partially represents the conditional independence structure among observations of a system if the local Markov condition holds, that is if every variable is independent of its non-descendants given its parents. In general, there is a whole class of DAGs that represents a given set of conditional independence relations. We are interested in properties of this class that can be derived from observations of a subsystem only. To this end, we prove an information-theoretic inequality that allows for the inference of common ancestors of observed parts in any DAG representing some unknown larger system. More explicitly, we show that a large amount of dependence in terms of mutual information among the observations implies the existence of a common ancestor that distributes this information. Within the causal interpretation of DAGs, our result can be seen as a quantitative extension of Reichenbach’s principle of common cause to more than two variables. Our conclusions are valid also for non-probabilistic observations, such as binary strings, since we state the proof for an axiomatized notion of “mutual information” that includes the stochastic as well as the algorithmic version.
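The quantity this abstract reasons about, mutual information between observations, can be computed for the simplest two-variable case as sketched below. This does not implement the paper's inequality for more than two variables; it only shows the stochastic (Shannon) mutual information for a discrete joint distribution, with hypothetical toy distributions as input.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal of X
        py[y] = py.get(y, 0.0) + p  # marginal of Y
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Two perfectly correlated bits versus two independent fair bits.
correlated = mutual_information({(0, 0): 0.5, (1, 1): 0.5})
independent = mutual_information({(a, b): 0.25 for a in (0, 1) for b in (0, 1)})
```

In the two-variable setting of Reichenbach's principle, a non-zero value such as the one obtained for the correlated pair is the kind of dependence that calls for a causal explanation, for example a common cause.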
Transitioning Existing Content: inferring organisation-specific documents

Directory of Open Access Journals (Sweden)

Arijit Sengupta

2000-11-01

Full Text Available A definition for a document type within an organization represents an organizational norm about the way the organizational actors represent products and supporting evidence of organizational processes. Generating a good organization-specific document structure is, therefore, important since it can capture a shared understanding among the organizational actors about how certain business processes should be performed. Current tools that generate document type definitions focus on the underlying technology, emphasizing tags created in a single instance document. The tools, thus, fall short of capturing the shared understanding between organizational actors about how a given document type should be represented. We propose a method for inferring organization-specific document structures using multiple instance documents as inputs. The method consists of heuristics that combine individual document definitions, which may have been compiled using standard algorithms. We propose a number of heuristics utilizing artificial intelligence and natural language processing techniques. As the research progresses, the heuristics will be tested on a suite of test cases representing multiple instance documents for different document types. The complete methodology will be implemented as a research prototype.
Identifying noncoding risk variants using disease-relevant gene regulatory networks.

Science.gov (United States)

Gao, Long; Uzun, Yasin; Gao, Peng; He, Bing; Ma, Xiaoke; Wang, Jiahui; Han, Shizhong; Tan, Kai

2018-02-16

Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of risk noncoding mutations.
Inference Attacks and Control on Database Structures

Directory of Open Access Journals (Sweden)

Muhamed Turkanovic

2015-02-01

Full Text Available Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible threats to privacy arising from inference, and control methods for mitigating these threats. The paper shows that using only access control, without any inference control, is inadequate, since access control models are unable to protect against indirect data access. Furthermore, it covers new inference problems that arise from new technologies such as XML, semantics, etc.
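The claim that access control alone cannot stop indirect data access can be illustrated with a classic differencing attack. The sketch below is not taken from the paper: the table, the function names, and the query-set-size control are all hypothetical and serve only to show how two permitted aggregate queries can reveal one protected value.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy table (name -> salary); the data is purely illustrative.
SALARIES: Dict[str, int] = {"ann": 52000, "bob": 61000, "eve": 58000, "dan": 47000}


def sum_query(predicate: Callable[[str], bool], min_set_size: int = 0) -> Optional[int]:
    """Aggregate-only interface: return SUM(salary) over the matching rows.

    When min_set_size > 0, refuse queries whose query set is too small or
    too large (a simple query-set-size control, assumed for illustration).
    """
    rows = [name for name in SALARIES if predicate(name)]
    n = len(SALARIES)
    if min_set_size and not (min_set_size <= len(rows) <= n - min_set_size):
        return None  # query refused by the inference control
    return sum(SALARIES[name] for name in rows)


def differencing_attack(target: str, min_set_size: int = 0) -> Optional[int]:
    """Infer one individual's salary from two aggregate queries."""
    total = sum_query(lambda name: True, min_set_size)
    rest = sum_query(lambda name: name != target, min_set_size)
    if total is None or rest is None:
        return None  # the control blocked at least one of the queries
    return total - rest
```

Without the control, `differencing_attack("eve")` recovers the individual value even though no single-row read was ever permitted. With `min_set_size=2` both queries are refused in this toy setting; size restrictions of this kind are known to be bypassable by more elaborate query combinations, so this is a starting point rather than a complete defence.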
Type Inference with Inequalities

DEFF Research Database (Denmark)

Schwartzbach, Michael Ignatieff

1991-01-01

Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, first-order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both...
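Computing a minimal solution of monotonic inequalities over a semilattice can be sketched with a fixed-point iteration. The following is a minimal sketch under strong assumptions, not the paper's construction: types are modelled as finite sets of hypothetical base-type names ordered by inclusion, each constraint has the form `variable >= f(environment)` with `f` monotone, and the collection of constraints is finite.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Type = FrozenSet[str]                            # a "type" is a set of base-type names
Env = Dict[str, Type]                            # assignment of types to variables
Constraint = Tuple[str, Callable[[Env], Type]]   # (variable, monotone lower bound)


def least_solution(variables: List[str], constraints: List[Constraint]) -> Env:
    """Return the least assignment satisfying var >= bound(env) for all constraints.

    Starts from the bottom element (the empty set) and joins in lower bounds
    until nothing changes. Terminates because the lattice is finite here.
    """
    env: Env = {v: frozenset() for v in variables}
    changed = True
    while changed:
        changed = False
        for var, lower_bound in constraints:
            joined = env[var] | lower_bound(env)   # join = set union
            if joined != env[var]:
                env[var] = joined
                changed = True
    return env


# Hypothetical program: x = 1; y = x; y = True; z = y
EXAMPLE: List[Constraint] = [
    ("x", lambda env: frozenset({"int"})),
    ("y", lambda env: env["x"]),
    ("y", lambda env: frozenset({"bool"})),
    ("z", lambda env: env["y"]),
]
```

Calling `least_solution(["x", "y", "z"], EXAMPLE)` assigns `{"int"}` to `x` and `{"int", "bool"}` to both `y` and `z`. Deciding typability would additionally require checking upper-bound constraints against this least solution, which the sketch leaves out.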

Nitrate leaching from a potato field using fuzzy inference system combined with genetic algorithm

DEFF Research Database (Denmark)

Shekofteh, Hosein; Afyuni, Majid M; Hajabbasi, Mohammad-Ali

2012-01-01

The conventional application of nitrogen fertilizers via irrigation is likely to be responsible for the increased nitrate concentration in groundwater of areas dominated by irrigated agriculture. This requires appropriate water and nutrient management to minimize groundwater pollution... in MFIS were tuned by Genetic Algorithm. The correlation coefficient, normalized root mean square error and relative mean absolute error percentage between the data obtained by HYDRUS-2D and the estimated values using MFIS model were 0.986, 0.086 and 2.38 respectively. It appears that MFIS can predict...
Fast noise level estimation algorithm based on principal component analysis transform and nonlinear rectification

Science.gov (United States)

Xu, Shaoping; Zeng, Xiaoxia; Jiang, Yinnan; Tang, Yiling

2018-01-01

We proposed a noniterative principal component analysis (PCA)-based noise level estimation (NLE) algorithm that addresses the problem of estimating the noise level with a two-step scheme. First, we randomly extracted a number of raw patches from a given noisy image and took the smallest eigenvalue of the covariance matrix of the raw patches as the preliminary estimation of the noise level. Next, the final estimation was directly obtained with a nonlinear mapping (rectification) function that was trained on some representative noisy images corrupted with different known noise levels. Compared with the state-of-the-art NLE algorithms, the experimental results show that the proposed NLE algorithm can reliably infer the noise level and has robust performance over a wide range of image contents and noise levels, showing a good compromise between speed and accuracy in general.
Inference in models with adaptive learning

NARCIS (Netherlands)

Chevillon, G.; Massmann, M.; Mavroeidis, S.

2010-01-01

Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be
On the Hardness of Topology Inference

Science.gov (United States)

Acharya, H. B.; Gouda, M. G.

Many systems require information about the topology of networks on the Internet, for purposes like management, efficiency, testing of new protocols and so on. However, ISPs usually do not share the actual topology maps with outsiders; thus, in order to obtain the topology of a network on the Internet, a system must reconstruct it from publicly observable data. The standard method employs traceroute to obtain paths between nodes; next, a topology is generated such that the observed paths occur in the graph. However, traceroute has the problem that some routers refuse to reveal their addresses, and appear as anonymous nodes in traces. Previous research on the problem of topology inference with anonymous nodes has demonstrated that it is at best NP-complete. In this paper, we improve upon this result. In our previous research, we showed that in the special case where nodes may be anonymous in some traces but not in all traces (so all node identifiers are known), there exist trace sets that are generable from multiple topologies. This paper extends our theory of network tracing to the general case (with strictly anonymous nodes), and shows that the problem of computing the network that generated a trace set, given the trace set, has no general solution. The weak version of the problem, which allows an algorithm to output a "small" set of networks (any one of which is the correct one), is also not solvable. Any algorithm guaranteed to output the correct topology outputs at least an exponential number of networks. Our results are surprisingly robust: they hold even when the network is known to have exactly two anonymous nodes, and every node as well as every edge in the network is guaranteed to occur in some trace. On the basis of this result, we suggest that exact reconstruction of network topology requires more powerful tools than traceroute.
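The ambiguity described above can be made concrete with a brute-force enumeration. This sketch is an illustration, not the paper's proof or algorithm: it assumes traces are lists of node names with `"*"` for an anonymous router, that one trace never visits the same router twice, and it simply tries every way of deciding which anonymous occurrences are the same physical node.

```python
from typing import FrozenSet, Iterator, List, Set

ANON = "*"  # marker for a router that hid its address in a trace

Topology = FrozenSet[FrozenSet[str]]  # a topology is a set of undirected edges


def partitions(n: int) -> Iterator[List[int]]:
    """Yield every set partition of range(n) as a restricted-growth label list."""
    def rec(labels: List[int], max_label: int) -> Iterator[List[int]]:
        if len(labels) == n:
            yield list(labels)
            return
        for lab in range(max_label + 2):
            labels.append(lab)
            yield from rec(labels, max(max_label, lab))
            labels.pop()
    yield from rec([], -1)


def candidate_topologies(traces: List[List[str]]) -> Set[Topology]:
    """Return all edge sets consistent with the traces under the assumptions above."""
    # Every anonymous occurrence gets its own slot: (trace index, position).
    slots = [(t, i) for t, trace in enumerate(traces)
             for i, node in enumerate(trace) if node == ANON]
    topologies: Set[Topology] = set()
    for labels in partitions(len(slots)):
        name = {slot: f"anon{lab}" for slot, lab in zip(slots, labels)}
        edges = set()
        valid = True
        for t, trace in enumerate(traces):
            resolved = [name.get((t, i), node) for i, node in enumerate(trace)]
            if len(set(resolved)) != len(resolved):
                valid = False  # a trace would visit one router twice
                break
            edges.update(frozenset(pair) for pair in zip(resolved, resolved[1:]))
        if valid:
            topologies.add(frozenset(edges))
    return topologies
```

For the two traces `["A", "*", "B"]` and `["A", "*", "C"]` the function returns two candidates (one shared hidden router, or two distinct ones); adding a third trace `["A", "*", "D"]` already gives five. The count follows the number of set partitions of the anonymous occurrences, which grows very quickly.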
Final Report: Sampling-Based Algorithms for Estimating Structure in Big Data.

Energy Technology Data Exchange (ETDEWEB)

Matulef, Kevin Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-02-01

The purpose of this project was to develop sampling-based algorithms to discover hidden structure in massive data sets. Inferring structure in large data sets is an increasingly common task in many critical national security applications. These data sets come from myriad sources, such as network traffic, sensor data, and data generated by large-scale simulations. They are often so large that traditional data mining techniques are time consuming or even infeasible. To address this problem, we focus on a class of algorithms that do not compute an exact answer, but instead use sampling to compute an approximate answer using fewer resources. The particular class of algorithms that we focus on are streaming algorithms, so called because they are designed to handle high-throughput streams of data. Streaming algorithms have only a small amount of working storage - much less than the size of the full data stream - so they must necessarily use sampling to approximate the correct answer. We present two results:
* A streaming algorithm called HyperHeadTail that estimates the degree distribution of a graph (i.e., the distribution of the number of connections for each node in a network). The degree distribution is a fundamental graph property, but prior work on estimating the degree distribution in a streaming setting was impractical for many real-world applications. We improve upon prior work by developing an algorithm that can handle streams with repeated edges, and graph structures that evolve over time.
* An algorithm for the task of maintaining a weighted subsample of items in a stream, when the items must be sampled according to their weight, and the weights are dynamically changing. To our knowledge, this is the first such algorithm designed for dynamically evolving weights. We expect it may be useful as a building block for other streaming algorithms on dynamic data sets.
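As background for the second result, the simpler static-weight case can be sketched with the well-known Efraimidis-Spirakis weighted reservoir scheme. This is explicitly not the report's algorithm, which targets dynamically changing weights; the function name and interface below are hypothetical.

```python
import heapq
import random
from typing import Hashable, Iterable, List, Optional, Tuple


def weighted_reservoir_sample(
    stream: Iterable[Tuple[Hashable, float]],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[Hashable]:
    """Keep k items from a stream of (item, weight) pairs using O(k) memory.

    Each item receives the key u ** (1 / weight) with u uniform in [0, 1);
    the k largest keys are retained, so heavier items are favoured.
    Weights are fixed at arrival time (static-weight assumption).
    """
    rng = rng or random.Random()
    heap: list = []  # min-heap of (key, arrival index, item)
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # items without positive weight can never be sampled
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, index, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, index, item))
    return [item for _, _, item in heap]
```

A single pass suffices and the working storage is independent of the stream length, which is the property the report emphasises for streaming algorithms. Supporting weights that change after arrival would need a different mechanism from the one shown here.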
Edge Detection Algorithm Based on Fuzzy Logic Theory for a Local Vision System of Robocup Humanoid League

Directory of Open Access Journals (Sweden)

Andrea K. Perez-Hernandez

2013-06-01

Full Text Available In this paper we present the development of an algorithm for edge extraction based on fuzzy logic theory. The method recognizes landmarks on the game field for the RoboCup Humanoid League. The proposed algorithm builds a fuzzy inference system that evaluates the relationship between image pixels by finding variations in the grey levels of neighboring pixels. The Otsu method is then applied to binarize the image obtained from the fuzzy process, producing an image that contains only the extracted edges; the algorithm is validated with Humanoid League images. Finally, we analyze the results, which show good performance: the proposal takes only an extra 35% of the processing time required by traditional methods, while the extracted edges are 52% less susceptible to noise.
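The binarization step mentioned in this abstract uses Otsu's method, which can be sketched independently of the fuzzy stage. The code below covers only that step and assumes the fuzzy output is available as a flat list of integer grey levels; the function names are hypothetical and the fuzzy inference system itself is not reproduced.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the grey level t maximizing the between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in the background class so far
    sum0 = 0    # sum of grey levels in the background class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels: Sequence[int], threshold: int) -> List[int]:
    """Map each pixel to 1 if it exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

On an image with two well-separated grey-level clusters, the returned threshold falls between them, so `binarize` keeps only the brighter cluster, which in the paper's pipeline would correspond to the extracted edges.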
Fiducial inference - A Neyman-Pearson interpretation

NARCIS (Netherlands)

Salome, D; VonderLinden, W; Dose,; Fischer, R; Preuss, R

1999-01-01

Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial
Uncertainty in prediction and in inference

NARCIS (Netherlands)

Hilgevoord, J.; Uffink, J.

1991-01-01

The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in
Efficient Maximum Likelihood Estimation for Pedigree Data with the Sum-Product Algorithm.

Science.gov (United States)

Engelhardt, Alexander; Rieger, Anna; Tresch, Achim; Mansmann, Ulrich

2016-01-01

We analyze data sets consisting of pedigrees with age at onset of colorectal cancer (CRC) as phenotype. The occurrence of familial clusters of CRC suggests the existence of a latent, inheritable risk factor. We aimed to compute the probability of a family possessing this risk factor as well as the hazard rate increase for these risk factor carriers. Due to the inheritability of this risk factor, the estimation necessitates a costly marginalization of the likelihood. We propose an improved EM algorithm by applying factor graphs and the sum-product algorithm in the E-step. This reduces the computational complexity from exponential to linear in the number of family members. Our algorithm is as precise as a direct likelihood maximization in a simulation study and a real family study on CRC risk. For 250 simulated families of size 19 and 21, the runtime of our algorithm is faster by a factor of 4 and 29, respectively. On the largest family (23 members) in the real data, our algorithm is 6 times faster. We introduce a flexible and runtime-efficient tool for statistical inference in biomedical event data with latent variables that opens the door for advanced analyses of pedigree data. © 2017 S. Karger AG, Basel.
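The reduction from exponential to linear cost through the sum-product algorithm can be seen on a much simpler structure than a pedigree. The sketch below is an assumption-laden illustration, not the authors' EM procedure: it uses a chain of binary variables with hypothetical unary and pairwise factors and compares brute-force marginalization with forward message passing.

```python
from itertools import product
from typing import List, Sequence


def brute_force_marginal(unary: Sequence[Sequence[float]],
                         pairwise: Sequence[Sequence[float]]) -> List[float]:
    """Marginal of the last variable by summing over all 2**n joint states."""
    n = len(unary)
    marginal = [0.0, 0.0]
    for states in product((0, 1), repeat=n):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[states[i]][states[i + 1]]
        marginal[states[-1]] += weight
    z = sum(marginal)
    return [m / z for m in marginal]


def sum_product_marginal(unary: Sequence[Sequence[float]],
                         pairwise: Sequence[Sequence[float]]) -> List[float]:
    """Same marginal with one forward pass of messages: O(n) instead of O(2**n)."""
    message = list(unary[0])
    for i in range(1, len(unary)):
        message = [
            unary[i][s] * sum(message[r] * pairwise[r][s] for r in (0, 1))
            for s in (0, 1)
        ]
    z = sum(message)
    return [m / z for m in message]
```

Both functions return the same distribution, but the second touches each variable once. On tree-structured factor graphs such as pedigrees without loops, the same idea applies with messages passed along the tree edges.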
Inferring physical properties of galaxies from their emission-line spectra

Science.gov (United States)

Ucci, G.; Ferrara, A.; Gallerani, S.; Pallottini, A.

2017-02-01

We present a new approach based on Supervised Machine Learning algorithms to infer key physical properties of galaxies (density, metallicity, column density and ionization parameter) from their emission-line spectra. We introduce a numerical code (called GAME, GAlaxy Machine learning for Emission lines) implementing this method and test it extensively. GAME delivers excellent predictive performances, especially for estimates of metallicity and column densities. We compare GAME with the most widely used diagnostics (e.g. R23, [N II] λ6584/Hα indicators) showing that it provides much better accuracy and wider applicability range. GAME is particularly suitable for use in combination with Integral Field Unit spectroscopy, both for rest-frame optical/UV nebular lines and far-infrared/sub-millimeter lines arising from photodissociation regions. Finally, GAME can also be applied to the analysis of synthetic galaxy maps built from numerical simulations.
A blind algorithm for recovering articulator positions from acoustics

Energy Technology Data Exchange (ETDEWEB)

Hogden, John E [Los Alamos National Laboratory

2009-01-01

MIMICRI is a signal-processing algorithm that has been shown to blindly infer and invert memoryless nonlinear functions of unobservable bandlimited signals, such as the mapping from the unobservable positions of the speech articulators to observable speech sounds. We review results of using MIMICRI on toy problems and on human speech data. We note that MIMICRI requires that the user specify two parameters: the dimensionality and pass-band of the unobservable signals. We show how to use cross-validation to help estimate the passband. An unexpected consequence of this work is that it helps separate signals with overlapping frequency bands.
ANUBIS: artificial neuromodulation using a Bayesian inference system.

Science.gov (United States)

Smith, Benjamin J H; Saaj, Chakravarthini M; Allouis, Elie

2013-01-01

Gain tuning is a crucial part of controller design and depends not only on an accurate understanding of the system in question, but also on the designer's ability to predict what disturbances and other perturbations the system will encounter throughout its operation. This letter presents ANUBIS (artificial neuromodulation using a Bayesian inference system), a novel biologically inspired technique for automatically tuning controller parameters in real time. ANUBIS is based on the Bayesian brain concept and modifies it by incorporating a model of the neuromodulatory system comprising four artificial neuromodulators. It has been applied to the controller of EchinoBot, a prototype walking rover for Martian exploration. ANUBIS has been implemented at three levels of the controller: gait generation, foot trajectory planning using Bézier curves, and foot trajectory tracking using a terminal sliding mode controller. We compare the results to a similar system that has been tuned using a multilayer perceptron (MLP). The use of Bayesian inference means that the system retains mathematical interpretability, unlike other intelligent tuning techniques, which use neural networks, fuzzy logic, or evolutionary algorithms. The simulation results show that ANUBIS provides significant improvements in efficiency and adaptability of the three controller components; it allows the robot to react to obstacles and uncertainties faster than the system tuned with the MLP, while maintaining stability and accuracy. As well as advancing rover autonomy, ANUBIS could also be applied to other situations where operating conditions are likely to change or cannot be accurately modeled in advance, such as process control. In addition, it demonstrates one way in which neuromodulation could fit into the Bayesian brain framework.
A statistical method for lung tumor segmentation uncertainty in PET images based on user inference.

Science.gov (United States)

Zheng, Chaojie; Wang, Xiuying; Feng, Dagan

2015-01-01

PET has been widely accepted as an effective imaging modality for lung tumor diagnosis and treatment. However, standard criteria for delineating tumor boundaries from PET have yet to be developed, largely due to the relatively low quality of PET images, uncertain tumor boundary definition, and the variety of tumor characteristics. In this paper, we propose a statistical solution to segmentation uncertainty on the basis of user inference. We first define the uncertainty segmentation band on the basis of a segmentation probability map constructed from the Random Walks (RW) algorithm; then, based on the extracted features of the user inference, we use Principal Component Analysis (PCA) to formulate the statistical model for labeling the uncertainty band. We validated our method on 10 lung PET-CT phantom studies from the public RIDER collections [1] and 16 clinical PET studies where tumors were manually delineated by two experienced radiologists. The methods were validated using the Dice similarity coefficient (DSC) to measure the spatial volume overlap. Our method achieved an average DSC of 0.878 ± 0.078 on phantom studies and 0.835 ± 0.039 on clinical studies.
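The Dice similarity coefficient used for validation is defined as DSC = 2|A ∩ B| / (|A| + |B|) for two segmentations A and B. A minimal sketch follows, assuming each segmentation is given as a collection of voxel coordinates; the function name and the convention for two empty masks are choices made here, not taken from the paper.

```python
from typing import Hashable, Iterable


def dice_coefficient(seg_a: Iterable[Hashable], seg_b: Iterable[Hashable]) -> float:
    """Dice similarity coefficient between two sets of voxel coordinates.

    Returns 1.0 for identical masks and 0.0 for disjoint ones. Two empty
    masks are treated as a perfect match (a convention assumed here).
    """
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

For example, two masks of three voxels each that share two voxels give 2·2 / (3 + 3) ≈ 0.667, which puts the reported averages of 0.878 and 0.835 in context.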
Polynomial Chaos Surrogates for Bayesian Inference

KAUST Repository

Le Maitre, Olivier

2016-01-06

Bayesian inference is a popular probabilistic method to solve inverse problems, such as the identification of a field parameter in a PDE model. The inference relies on Bayes' rule to update the prior density of the sought field from observations and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov Chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of Karhunen-Loève decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proven effective in many situations, Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be an extremely costly task as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very informative, the inferred parameter field can depend strongly on its prior, which can be somewhat arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such an approach adaptive to further improve its efficiency.
The zebrafish progranulin gene family and antisense transcripts

Directory of Open Access Journals (Sweden)

Baranowski David

2005-11-01

Full Text Available Background: Progranulin is an epithelial tissue growth factor (also known as proepithelin, acrogranin and PC-cell-derived growth factor) that has been implicated in development, wound healing and in the progression of many cancers. The single mammalian progranulin gene encodes a glycoprotein precursor consisting of seven and one half tandemly repeated non-identical copies of the cystine-rich granulin motif. A genome-wide duplication event hypothesized to have occurred at the base of the teleost radiation predicts that mammalian progranulin may be represented by two co-orthologues in zebrafish. Results: The cDNAs encoding two zebrafish granulin precursors, progranulins-A and -B, were characterized and found to contain 10 and 9 copies of the granulin motif respectively. The cDNAs and genes encoding the two forms of granulin, progranulins-1 and -2, were also cloned and sequenced. Both latter peptides were found to be encoded by precursors with a simplified architecture consisting of one and one half copies of the granulin motif. A cDNA encoding a chimeric progranulin which likely arises through the mechanism of trans-splicing between grn1 and grn2 was also characterized. A non-coding RNA gene with antisense complementarity to both grn1 and grn2 was identified which may have functional implications with respect to gene dosage, as well as in restricting the formation of the chimeric form of progranulin. Chromosomal localization of the four progranulin (grn) genes reveals syntenic conservation for grna only, suggesting that it is the true orthologue of mammalian grn. RT-PCR and whole-mount in situ hybridization analysis of zebrafish grns during development reveals that combined expression of grna and grnb, but not grn1 and grn2, recapitulates many of the expression patterns observed for the murine counterpart. This includes maternal deposition, widespread central nervous system distribution and specific localization within the epithelial
Interactive Instruction in Bayesian Inference

DEFF Research Database (Denmark)

Khan, Azam; Breslav, Simon; Hornbæk, Kasper

2018-01-01

An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions... that an instructional approach to improving human performance in Bayesian inference is a promising direction.
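For readers unfamiliar with the Mammography Problem, the underlying calculation is a single application of Bayes' rule. The numbers below (1% prevalence, 80% sensitivity, 9.6% false-positive rate) are the ones commonly quoted for the classic version of the problem and are an assumption here, since this record does not state them; the function names are hypothetical.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """P(condition | positive test) by Bayes' rule."""
    evidence = prior * sensitivity + (1.0 - prior) * false_positive_rate
    return prior * sensitivity / evidence


def natural_frequencies(population: int, prior: float,
                        sensitivity: float, false_positive_rate: float):
    """Restate the same problem as expected counts in a population."""
    with_condition = population * prior
    true_positives = with_condition * sensitivity
    false_positives = (population - with_condition) * false_positive_rate
    return true_positives, false_positives
```

With the assumed numbers, `posterior(0.01, 0.8, 0.096)` is about 0.078: of roughly 103 positive results per 1,000 women, only 8 come from women who have the condition. The frequency restatement is one of the presentation formats often used to make this result easier to grasp.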
The early emergence and puzzling decline of relational reasoning: Effects of knowledge and search on inferring abstract concepts.

Science.gov (United States)

Walker, Caren M; Bridgers, Sophie; Gopnik, Alison

2016-11-01

We explore the developmental trajectory and underlying mechanisms of abstract relational reasoning. We describe a surprising developmental pattern: Younger learners are better than older ones at inferring abstract causal relations. Walker and Gopnik (2014) demonstrated that toddlers are able to infer that an effect was caused by a relation between two objects (whether they are the same or different), rather than by individual kinds of objects. While these findings are consistent with evidence that infants recognize same-different relations, they contrast with a large literature suggesting that older children tend to have difficulty inferring these relations. Why might this be? In Experiment 1a, we demonstrate that while younger children (18-30-month-olds) have no difficulty learning these relational concepts, older children (36-48-month-olds) fail to draw this abstract inference. Experiment 1b replicates the finding with 18-30-month-olds using a more demanding intervention task. Experiment 2 tests whether this difference in performance might be because older children have developed the general hypothesis that individual kinds of objects are causal - the high initial probability of this alternative hypothesis might override the data that favors the relational hypothesis. Providing additional information falsifying the alternative hypothesis improves older children's performance. Finally, Experiment 3 demonstrates that prompting for explanations during learning also improves performance, even without any additional information. These findings are discussed in light of recent computational and algorithmic theories of learning. Copyright © 2016 Elsevier B.V. All rights reserved.
Inferring Phylogenetic Networks Using PhyloNet.

Science.gov (United States)

Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

2018-07-01

PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
A Novel Fuzzy Algorithm to Introduce New Variables in the Drug Supply Decision-Making Process in Medicine

Directory of Open Access Journals (Sweden)

Jose M. Gonzalez-Cava

2018-01-01

Full Text Available One of the main challenges in medicine is to guarantee an appropriate drug supply according to the real needs of patients. Closed-loop strategies have been widely used to develop automatic solutions based on feedback variables. However, when the variable of interest cannot be directly measured or there is a lack of knowledge behind the process, it turns into a difficult issue to solve. In this research, a novel algorithm to approach this problem is presented. The main objective of this study is to provide a new general algorithm capable of determining the influence of a certain clinical variable in the decision making process for drug supply and then defining an automatic system able to guide the process considering this information. Thus, this new technique will provide a way to validate a given physiological signal as a feedback variable for drug titration. In addition, the result of the algorithm in terms of fuzzy rules and membership functions will define a fuzzy-based decision system for the drug delivery process. The method proposed is based on a Fuzzy Inference System whose structure is obtained through a decision tree algorithm. A four-step methodology is then developed: data collection, preprocessing, Fuzzy Inference System generation, and the validation of results. To test this methodology, the analgesia control scenario was analysed. Specifically, the viability of the Analgesia Nociception Index (ANI) as a guiding variable for the analgesic process during surgical interventions was studied. Real data was obtained from fifteen patients undergoing cholecystectomy surgery.
Co-Inheritance Analysis within the Domains of Life Substantially Improves Network Inference by Phylogenetic Profiling.

Directory of Open Access Journals (Sweden)

Junha Shin

Full Text Available Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways are inherited primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
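
The idea of scoring co-inheritance within a domain rather than across all reference species can be sketched on binary presence/absence profiles. This is a toy illustration only: the Jaccard index is an assumed similarity measure (the record does not state which score the authors use), and the profiles, domain labels, and function names are hypothetical.

```python
from typing import Dict, List, Sequence


def jaccard(profile_a: Sequence[int], profile_b: Sequence[int],
            species: Sequence[int]) -> float:
    """Jaccard similarity of two 0/1 profiles restricted to the given species indices."""
    both = sum(1 for i in species if profile_a[i] and profile_b[i])
    either = sum(1 for i in species if profile_a[i] or profile_b[i])
    return both / either if either else 0.0


def within_domain_scores(profile_a: Sequence[int], profile_b: Sequence[int],
                         domains: Sequence[str]) -> Dict[str, float]:
    """Score co-inheritance separately inside each domain of reference species."""
    by_domain: Dict[str, List[int]] = {}
    for index, domain in enumerate(domains):
        by_domain.setdefault(domain, []).append(index)
    return {d: jaccard(profile_a, profile_b, idx) for d, idx in by_domain.items()}


# Hypothetical example: 4 bacterial and 4 eukaryotic reference species.
DOMAINS = ["Bacteria"] * 4 + ["Eukaryota"] * 4
GENE_1 = [1, 1, 0, 0, 0, 0, 0, 0]
GENE_2 = [1, 1, 0, 0, 1, 1, 1, 1]
```

In this example the two genes are perfectly co-inherited among the bacterial species (score 1.0), but the all-species score drops to about 0.33 because the second gene is also present throughout the eukaryotic species. That dilution is the kind of confounding from other taxonomic groups that the abstract describes.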

Active inference and learning.

Science.gov (United States)

Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O'Doherty, John; Pezzulo, Giovanni

2016-09-01

This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Currents on Grassmann algebras

International Nuclear Information System (INIS)

Coquereaux, R.; Ragoucy, E.

1993-09-01

Currents are defined on a Grassmann algebra Gr(N) with N generators as distributions on its exterior algebra (using the symmetric wedge product). The currents are interpreted in terms of Z_2-graded Hochschild cohomology and closed currents in terms of cyclic cocycles (they are particular multilinear forms on Gr(N)). An explicit construction of the vector space of closed currents of degree p on Gr(N) is given by using Berezin integration. (authors). 10 refs
Inferring Groups of Objects, Preferred Routes, and Facility Locations from Trajectories

DEFF Research Database (Denmark)

Ceikute, Vaida

(i) infer groups of objects traveling together, (ii) determine routes preferred by local drivers, and (iii) identify attractive facility locations. First, we present a framework that efficiently supports online discovery of groups of moving objects that travel together. We adopt a sampling......-independent approach that makes no assumptions about when object positions are sampled and that supports the use of approximate trajectories. The framework’s algorithms exploit density-based clustering to identify groups. Such identified groups are scored based on cardinality and duration. With the use of domination...... and similarity notions, groups of low interest are pruned, and a variety of different, interesting groups are returned. Results from empirical studies with real and synthetic data offer insight into the effectiveness and efficiency of the proposed framework. Next, we view GPS trajectories as trips that represent...
GAMBIT: the global and modular beyond-the-standard-model inference tool

Science.gov (United States)

Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Buckley, Andy; Chrząszcz, Marcin; Conrad, Jan; Cornell, Jonathan M.; Dal, Lars A.; Dickinson, Hugh; Edsjö, Joakim; Farmer, Ben; Gonzalo, Tomás E.; Jackson, Paul; Krislock, Abram; Kvellestad, Anders; Lundberg, Johan; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Raklev, Are; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Savage, Christopher; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; White, Martin; Wild, Sebastian

2017-11-01

We describe the open-source global fitting package GAMBIT: the Global And Modular Beyond-the-Standard-Model Inference Tool. GAMBIT combines extensive calculations of observables and likelihoods in particle and astroparticle physics with a hierarchical model database, advanced tools for automatically building analyses of essentially any model, a flexible and powerful system for interfacing to external codes, a suite of different statistical methods and parameter scanning algorithms, and a host of other utilities designed to make scans faster, safer and more easily-extendible than in the past. Here we give a detailed description of the framework, its design and motivation, and the current models and other specific components presently implemented in GAMBIT. Accompanying papers deal with individual modules and present first GAMBIT results. GAMBIT can be downloaded from gambit.hepforge.org.
GAMBIT. The global and modular beyond-the-standard-model inference tool

Energy Technology Data Exchange (ETDEWEB)

Athron, Peter; Balazs, Csaba [Monash University, School of Physics and Astronomy, Melbourne, VIC (Australia); Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); Bringmann, Torsten; Dal, Lars A.; Gonzalo, Tomas E.; Krislock, Abram; Raklev, Are [University of Oslo, Department of Physics, Oslo (Norway); Buckley, Andy [University of Glasgow, SUPA, School of Physics and Astronomy, Glasgow (United Kingdom); Chrzaszcz, Marcin [Universitaet Zuerich, Physik-Institut, Zurich (Switzerland); Polish Academy of Sciences, H. Niewodniczanski Institute of Nuclear Physics, Krakow (Poland); Conrad, Jan; Edsjoe, Joakim; Farmer, Ben; Lundberg, Johan [AlbaNova University Centre, Oskar Klein Centre for Cosmoparticle Physics, Stockholm (Sweden); Stockholm University, Department of Physics, Stockholm (Sweden); Cornell, Jonathan M. [McGill University, Department of Physics, Montreal, QC (Canada); Dickinson, Hugh [University of Minnesota, Minnesota Institute for Astrophysics, Minneapolis, MN (United States); Jackson, Paul; White, Martin [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); University of Adelaide, Department of Physics, Adelaide, SA (Australia); Kvellestad, Anders; Savage, Christopher [NORDITA, Stockholm (Sweden); McKay, James [Imperial College London, Blackett Laboratory, Department of Physics, London (United Kingdom); Mahmoudi, Farvah [Univ Lyon, Univ Lyon 1, ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR5574, Saint-Genis-Laval (France); CERN, Theoretical Physics Department, Geneva (Switzerland); Martinez, Gregory D. [University of California, Physics and Astronomy Department, Los Angeles, CA (United States); Putze, Antje [LAPTh, Universite de Savoie, CNRS, Annecy-le-Vieux (France); Ripken, Joachim [Max Planck Institute for Solar System Research, Goettingen (Germany); Rogan, Christopher [Harvard University, Department of Physics, Cambridge, MA (United States); Saavedra, Aldo [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); The University of Sydney, Faculty of Engineering and Information Technologies, Centre for Translational Data Science, School of Physics, Sydney, NSW (Australia); Scott, Pat [Imperial College London, Blackett Laboratory, Department of Physics, London (United Kingdom); Seo, Seon-Hee [Seoul National University, Department of Physics and Astronomy, Seoul (Korea, Republic of); Serra, Nicola [Universitaet Zuerich, Physik-Institut, Zurich (Switzerland); Weniger, Christoph [University of Amsterdam, GRAPPA, Institute of Physics, Amsterdam (Netherlands); Wild, Sebastian [DESY, Hamburg (Germany); Collaboration: The GAMBIT Collaboration

2017-11-15

We describe the open-source global fitting package GAMBIT: the Global And Modular Beyond-the-Standard-Model Inference Tool. GAMBIT combines extensive calculations of observables and likelihoods in particle and astroparticle physics with a hierarchical model database, advanced tools for automatically building analyses of essentially any model, a flexible and powerful system for interfacing to external codes, a suite of different statistical methods and parameter scanning algorithms, and a host of other utilities designed to make scans faster, safer and more easily-extendible than in the past. Here we give a detailed description of the framework, its design and motivation, and the current models and other specific components presently implemented in GAMBIT. Accompanying papers deal with individual modules and present first GAMBIT results. GAMBIT can be downloaded from gambit.hepforge.org. (orig.)
GAMBIT. The global and modular beyond-the-standard-model inference tool

International Nuclear Information System (INIS)

Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Dal, Lars A.; Gonzalo, Tomas E.; Krislock, Abram; Raklev, Are; Buckley, Andy; Chrzaszcz, Marcin; Conrad, Jan; Edsjoe, Joakim; Farmer, Ben; Lundberg, Johan; Cornell, Jonathan M.; Dickinson, Hugh; Jackson, Paul; White, Martin; Kvellestad, Anders; Savage, Christopher; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; Wild, Sebastian

2017-01-01

We describe the open-source global fitting package GAMBIT: the Global And Modular Beyond-the-Standard-Model Inference Tool. GAMBIT combines extensive calculations of observables and likelihoods in particle and astroparticle physics with a hierarchical model database, advanced tools for automatically building analyses of essentially any model, a flexible and powerful system for interfacing to external codes, a suite of different statistical methods and parameter scanning algorithms, and a host of other utilities designed to make scans faster, safer and more easily-extendible than in the past. Here we give a detailed description of the framework, its design and motivation, and the current models and other specific components presently implemented in GAMBIT. Accompanying papers deal with individual modules and present first GAMBIT results. GAMBIT can be downloaded from gambit.hepforge.org. (orig.)
A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

Science.gov (United States)

Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

2018-05-09

Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.
A simple algorithm to estimate genetic variance in an animal threshold model using Bayesian inference Genetics Selection Evolution 2010, 42:29

DEFF Research Database (Denmark)

Ødegård, Jørgen; Meuwissen, Theo HE; Heringstad, Bjørg

2010-01-01

Background In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where...... records exist for the parents). Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to the sire-dam model). Conclusions The new algorithm to estimate genetic parameters via Gibbs sampling solves the bias problems typically occurring in animal...... individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods In the proposed algorithm, individuals are classified as either "informative...
Design of a modified adaptive neuro fuzzy inference system classifier for medical diagnosis of Pima Indians Diabetes

Science.gov (United States)

Sagir, Abdu Masanawa; Sathasivam, Saratha

2017-08-01

Medical diagnosis is the process of determining which disease or medical condition explains a person's determinable signs and symptoms. Diagnosing most diseases is very expensive because many tests are required for prediction. This paper aims to introduce an improved hybrid approach for training the adaptive network based fuzzy inference system with a modified Levenberg-Marquardt algorithm that uses an analytical derivation scheme to compute the Jacobian matrix. The goal is to investigate how certain diseases are affected by a patient's characteristics and measurements, such as abnormalities, or a decision about the presence or absence of a disease. To achieve an accurate diagnosis at this complex stage of symptom analysis, the physician may need an efficient diagnosis system to classify and predict the patient's condition by using an adaptive neuro fuzzy inference system (ANFIS) pre-processed by grid partitioning. The proposed hybridised intelligent system was tested with the Pima Indian Diabetes dataset obtained from the University of California at Irvine's (UCI) machine learning repository. The proposed method's performance was evaluated based on training and test datasets. In addition, its effectiveness was assessed by measuring total accuracy, sensitivity and specificity. The proposed method achieves superior performance compared with the conventional ANFIS-based gradient descent algorithm and some related existing methods. The implementation used MATLAB R2014a (version 8.3) and was run on a PC with an Intel Pentium IV E7400 processor at 2.80 GHz and 2.0 GB of RAM.
A new normalizing algorithm for BAC CGH arrays with quality control metrics.

Science.gov (United States)

Miecznikowski, Jeffrey C; Gaile, Daniel P; Liu, Song; Shepherd, Lori; Nowak, Norma

2011-01-01

The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model this data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose "SmoothArray", a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays and we show the effects on a cancer dataset. As part of our R software package "aCGHplus," this freely available algorithm removes the variation due to the intensity effects, pin/print-tip, the spatial location on the microarray chip, and the relative location from the well plate. Removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, the user can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.
A New Normalizing Algorithm for BAC CGH Arrays with Quality Control Metrics

Directory of Open Access Journals (Sweden)

Jeffrey C. Miecznikowski

2011-01-01

The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model this data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose “SmoothArray”, a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays and we show the effects on a cancer dataset. As part of our R software package “aCGHplus,” this freely available algorithm removes the variation due to the intensity effects, pin/print-tip, the spatial location on the microarray chip, and the relative location from the well plate. Removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, the user can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.
Active Inference, homeostatic regulation and adaptive behavioural control.

Science.gov (United States)

Pezzulo, Giovanni; Rigoli, Francesco; Friston, Karl

2015-11-01

We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Generative Inferences Based on Learned Relations

Science.gov (United States)

Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.

2017-01-01

A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…
Measurement and inference of profile soil-water dynamics at different hillslope positions in a semiarid agricultural watershed

Science.gov (United States)

Green, Timothy R.; Erskine, Robert H.

2011-12-01

Dynamics of profile soil water vary with terrain, soil, and plant characteristics. The objectives addressed here are to quantify dynamic soil water content over a range of slope positions, infer soil profile water fluxes, and identify locations most likely influenced by multidimensional flow. The instrumented 56 ha watershed lies mostly within a dryland (rainfed) wheat field in semiarid eastern Colorado. Dielectric capacitance sensors were used to infer hourly soil water content for approximately 8 years (minus missing data) at 18 hillslope positions and four or more depths. Based on previous research and a new algorithm, sensor measurements (resonant frequency) were rescaled to estimate soil permittivity, then corrected for temperature effects on bulk electrical conductivity before inferring soil water content. Using a mass-conservation method, we analyzed multitemporal changes in soil water content at each sensor to infer the dynamics of water flux at different depths and landscape positions. At summit positions vertical processes appear to control profile soil water dynamics. At downslope positions infrequent overland flow and unsaturated subsurface lateral flow appear to influence soil water dynamics. Crop water use accounts for much of the variability in soil water between transects that are either cropped or fallow in alternating years, while soil hydraulic properties and near-surface hydrology affect soil water variability across landscape positions within each management zone. The observed spatiotemporal patterns exhibit the joint effects of short-term hydrology and long-term soil development. Quantitative methods of analyzing soil water patterns in space and time improve our understanding of dominant soil hydrological processes and provide alternative measures of model performance.
Dissociation of Frontotemporal Dementia–Related Deficits and Neuroinflammation in Progranulin Haploinsufficient Mice

Science.gov (United States)

Filiano, Anthony J.; Martens, Lauren Herl; Young, Allen H.; Warmus, Brian A.; Zhou, Ping; Diaz-Ramirez, Grisell; Jiao, Jian; Zhang, Zhijun; Huang, Eric J.; Gao, Fen-Biao; Farese, Robert V.; Roberson, Erik D.

2013-01-01

Frontotemporal dementia (FTD) is a neurodegenerative disease with hallmark deficits in social and emotional function. Heterozygous loss-of-function mutations in GRN, the progranulin gene, are a common genetic cause of the disorder, but the mechanisms by which progranulin haploinsufficiency causes neuronal dysfunction in FTD are unclear. Homozygous progranulin knockout (Grn−/−) mice have been studied as a model of this disorder and show behavioral deficits and a neuroinflammatory phenotype with robust microglial activation. However, homozygous GRN mutations causing complete progranulin deficiency were recently shown to cause a different neurological disorder, neuronal ceroid lipofuscinosis, suggesting that the total absence of progranulin may have effects distinct from those of haploinsufficiency. Here, we studied progranulin heterozygous (Grn+/−) mice, which model progranulin haploinsufficiency. We found that Grn+/− mice developed age-dependent social and emotional deficits potentially relevant to FTD. However, unlike Grn−/− mice, behavioral deficits in Grn+/− mice occurred in the absence of gliosis or increased expression of tumor necrosis factor–α. Instead, we found neuronal abnormalities in the amygdala, an area of selective vulnerability in FTD, in Grn+/− mice. Our findings indicate that FTD-related deficits due to progranulin haploinsufficiency can develop in the absence of detectable gliosis and neuroinflammation, thereby dissociating microglial activation from functional deficits and suggesting an important effect of progranulin deficiency on neurons. PMID:23516300
Brain networks for confidence weighting and hierarchical inference during probabilistic learning.

Science.gov (United States)

Meyniel, Florent; Dehaene, Stanislas

2017-05-09

Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This "confidence weighting" implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain's learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.
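As a rough illustration of the contrast drawn in this abstract, the sketch below compares a delta rule with a constant learning rate against an update whose effective learning rate shrinks as evidence accumulates. It is a deliberately simplified, stationary Beta-Bernoulli observer, not the hierarchical ideal-observer model used in the study; the function names and default values are assumptions made for illustration.

```python
def delta_rule(observations, learning_rate=0.1, start=0.5):
    """Delta rule with a constant learning rate (hypothetical default values)."""
    estimate = start
    for x in observations:
        # The same fixed fraction of the prediction error is applied every time.
        estimate += learning_rate * (x - estimate)
    return estimate


def confidence_weighted(observations, a=1.0, b=1.0):
    """Posterior mean of a Bernoulli probability under a Beta(a, b) prior.

    Written as an error-driven update so that the effective learning rate,
    1 / (a + b + 1), is visible: it falls as confidence (evidence) grows.
    """
    mean = a / (a + b)
    for x in observations:
        rate = 1.0 / (a + b + 1.0)
        mean += rate * (x - mean)
        a += x        # count of observed 1s
        b += 1 - x    # count of observed 0s
    return mean, a + b
```

With binary observations, the second function returns the estimate together with the accumulated pseudo-count, which serves here as a crude stand-in for confidence.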
Brain networks for confidence weighting and hierarchical inference during probabilistic learning

Science.gov (United States)

Meyniel, Florent; Dehaene, Stanislas

2017-01-01

Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This “confidence weighting” implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain’s learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences. PMID:28439014
Parametric statistical inference basic theory and modern approaches

CERN Document Server

Zacks, Shelemyahu; Tsokos, C P

1981-01-01

Parametric Statistical Inference: Basic Theory and Modern Approaches presents the developments and modern trends in statistical inference to students who do not have advanced mathematical and statistical preparation. The topics discussed in the book are basic and common to many fields of statistical inference and thus serve as a jumping board for in-depth study. The book is organized into eight chapters. Chapter 1 provides an overview of how the theory of statistical inference is presented in subsequent chapters. Chapter 2 briefly discusses statistical distributions and their properties. Chapt
Optimal plasma progranulin cutoff value for predicting null progranulin mutations in neurodegenerative diseases: a multicenter Italian study.

Science.gov (United States)

Ghidoni, Roberta; Stoppani, Elena; Rossi, Giacomina; Piccoli, Elena; Albertini, Valentina; Paterlini, Anna; Glionna, Michela; Pegoiani, Eleonora; Agnati, Luigi F; Fenoglio, Chiara; Scarpini, Elio; Galimberti, Daniela; Morbin, Michela; Tagliavini, Fabrizio; Binetti, Giuliano; Benussi, Luisa

2012-01-01

Recently, attention was drawn to a role for progranulin in the central nervous system with the identification of mutations in the progranulin gene (GRN) as an important cause of frontotemporal lobar degeneration. GRN mutations are associated with a strong reduction of circulating progranulin and widely variable clinical phenotypes: thus, the dosage of plasma progranulin is a useful tool for a quick and inexpensive large-scale screening of carriers of GRN mutations. The aim was to establish the best cutoff threshold for normal versus abnormal levels of plasma progranulin. A total of 309 cognitively healthy controls (25-87 years of age), 72 affected and unaffected GRN+ null mutation carriers (24-86 years of age), 3 affected GRN missense mutation carriers, 342 patients with neurodegenerative diseases and 293 subjects with mild cognitive impairment were enrolled at the Memory Clinic, IRCCS S. Giovanni di Dio-Fatebenefratelli, Brescia, Italy, and at the Alzheimer Unit, Ospedale Maggiore Policlinico and IRCCS Istituto Neurologico C. Besta, Milan, Italy. Plasma progranulin levels were measured using an ELISA kit (AdipoGen Inc., Seoul, Korea). Plasma progranulin did not correlate with age, gender or body mass index. We established a new plasma progranulin protein cutoff level of 61.55 ng/ml that identifies, with a specificity of 99.6% and a sensitivity of 95.8%, null mutation carriers among subjects attending a memory clinic. Affected and unaffected GRN null mutation carriers did not differ in terms of circulating progranulin protein (p = 0.686). A significant disease anticipation was observed in GRN+ subjects with the lowest progranulin levels. We propose a new plasma progranulin protein cutoff level useful for clinical practice. Copyright © 2011 S. Karger AG, Basel.
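To make the reported figures easier to interpret, the following sketch shows how sensitivity and specificity follow from a cutoff on plasma levels. It assumes that a level at or below the cutoff counts as a positive screen (carriers have reduced progranulin); the function name is hypothetical and no study data are reproduced, so any inputs must be supplied by the reader.

```python
def sensitivity_specificity(carrier_levels, non_carrier_levels, cutoff):
    """Return (sensitivity, specificity) for a 'level <= cutoff' screening rule.

    carrier_levels: plasma levels (ng/ml) of known mutation carriers.
    non_carrier_levels: plasma levels (ng/ml) of known non-carriers.
    The '<=' convention is an assumption made for this illustration.
    """
    true_positives = sum(1 for level in carrier_levels if level <= cutoff)
    true_negatives = sum(1 for level in non_carrier_levels if level > cutoff)
    sensitivity = true_positives / len(carrier_levels)
    specificity = true_negatives / len(non_carrier_levels)
    return sensitivity, specificity
```

Sweeping the cutoff over a range of values and recomputing both quantities is one common way to choose a threshold; whether the study did so in this form is not stated in the abstract.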
Restoring neuronal progranulin reverses deficits in a mouse model of frontotemporal dementia.

Science.gov (United States)

Arrant, Andrew E; Filiano, Anthony J; Unger, Daniel E; Young, Allen H; Roberson, Erik D

2017-05-01

Loss-of-function mutations in progranulin (GRN), a secreted glycoprotein expressed by neurons and microglia, are a common autosomal dominant cause of frontotemporal dementia, a neurodegenerative disease commonly characterized by disrupted social and emotional behaviour. GRN mutations are thought to cause frontotemporal dementia through progranulin haploinsufficiency; therefore, boosting progranulin expression from the intact allele is a rational treatment strategy. However, this approach has not been tested in an animal model of frontotemporal dementia and it is unclear if boosting progranulin could correct pre-existing deficits. Here, we show that adeno-associated virus-driven expression of progranulin in the medial prefrontal cortex reverses social dominance deficits in Grn+/- mice, an animal model of frontotemporal dementia due to GRN mutations. Adeno-associated virus-progranulin also corrected lysosomal abnormalities in Grn+/- mice. The adeno-associated virus-progranulin vector only transduced neurons, suggesting that restoring neuronal progranulin is sufficient to correct deficits in Grn+/- mice. To further test the role of neuronal progranulin in the development of frontotemporal dementia-related deficits, we generated two neuronal progranulin-deficient mouse lines using CaMKII-Cre and Nestin-Cre. Measuring progranulin levels in these lines indicated that most brain progranulin is derived from neurons. Both neuronal progranulin-deficient lines developed social dominance deficits similar to those in global Grn+/- mice, showing that neuronal progranulin deficiency is sufficient to disrupt social behaviour. These data support the concept of progranulin-boosting therapies for frontotemporal dementia and highlight an important role for neuron-derived progranulin in maintaining normal social function. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
Dynamic modeling of genetic networks using genetic algorithm and S-system.

Science.gov (United States)

Kikuchi, Shinichi; Tominaga, Daisuke; Arita, Masanori; Takahashi, Katsutoshi; Tomita, Masaru

2003-03-22

The modeling of system dynamics of genetic networks, metabolic networks or signal transduction cascades from time-course data is formulated as a reverse-problem. Previous studies focused on the estimation of only network structures, and they were ineffective in inferring a network structure with feedback loops. We previously proposed a method to predict not only the network structure but also its dynamics using a Genetic Algorithm (GA) and an S-system formalism. However, it could predict only a small number of parameters and could rarely obtain essential structures. In this work, we propose a unified extension of the basic method. Notable improvements are as follows: (1) an additional term in its evaluation function that aims at eliminating futile parameters; (2) a crossover method called Simplex Crossover (SPX) to improve its optimization ability; and (3) a gradual optimization strategy to increase the number of predictable parameters. The proposed method is implemented as a C program called PEACE1 (Predictor by Evolutionary Algorithms and Canonical Equations 1). Its performance was compared with the basic method. The comparison showed that: (1) the convergence rate increased about 5-fold; (2) the optimization speed was raised about 1.5-fold; and (3) the number of predictable parameters was increased about 5-fold. Moreover, we successfully inferred the dynamics of a small genetic network constructed with 60 parameters for 5 network variables and feedback loops using only time-course data of gene expression.
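For readers unfamiliar with the S-system formalism named in this record, it is conventionally written as dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, which gives 2n(n+1) parameters for n variables (60 for n = 5, consistent with the count quoted in the abstract). The sketch below evaluates that right-hand side; it is not the PEACE1 implementation, and the function names are hypothetical.

```python
import numpy as np


def s_system_rhs(x, alpha, beta, g, h):
    """Time derivative of each variable under the standard S-system form.

    x: concentrations, shape (n,), assumed strictly positive.
    alpha, beta: rate constants, shape (n,).
    g, h: kinetic orders, shape (n, n); row i holds the exponents for variable i.
    """
    x = np.asarray(x, dtype=float)
    # Broadcasting raises x[j] to g[i, j]; the product over j gives each term.
    production = alpha * np.prod(x ** g, axis=1)
    degradation = beta * np.prod(x ** h, axis=1)
    return production - degradation


def parameter_count(n):
    """alpha and beta (2n) plus the g and h matrices (2n^2)."""
    return 2 * n * (n + 1)
```

An optimizer such as a genetic algorithm would search over alpha, beta, g and h so that trajectories integrated from this right-hand side match the observed time courses; the evaluation function and crossover scheme described in the abstract are not reproduced here.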
Velocity of climate change algorithms for guiding conservation and management.

Science.gov (United States)

Hamann, Andreas; Roberts, David R; Barber, Quinn E; Carroll, Carlos; Nielsen, Scott E

2015-02-01

The velocity of climate change is an elegant analytical concept that can be used to evaluate the exposure of organisms to climate change. In essence, one divides the rate of climate change by the rate of spatial climate variability to obtain a speed at which species must migrate over the surface of the earth to maintain constant climate conditions. However, to apply the algorithm for conservation and management purposes, additional information is needed to improve realism at local scales. For example, destination information is needed to ensure that vectors describing speed and direction of required migration do not point toward a climatic cul-de-sac by pointing beyond mountain tops. Here, we present an analytical approach that conforms to standard velocity algorithms if climate equivalents are nearby. Otherwise, the algorithm extends the search for climate refugia, which can be expanded to search for multivariate climate matches. With source and destination information available, forward and backward velocities can be calculated allowing useful inferences about conservation of species (present-to-future velocities) and management of species populations (future-to-present velocities). © 2014 The Authors. Global Change Biology Published by John Wiley & Sons Ltd.
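The two calculations described in the abstract above can be sketched as follows. The first function is the basic ratio of temporal rate to spatial gradient; the second is a simplified, one-dimensional stand-in for the destination search, where each cell looks for the nearest cell whose future climate matches its current climate. Function names, units, the tolerance parameter and the transect layout are all assumptions for illustration, not the authors' implementation.

```python
def gradient_velocity(temporal_rate, spatial_gradient):
    """Speed in km/yr from a temporal rate (degC/yr) and a spatial gradient (degC/km)."""
    if spatial_gradient == 0:
        # No spatial variability: no finite migration speed keeps climate constant.
        return float("inf")
    return abs(temporal_rate) / abs(spatial_gradient)


def forward_analog_velocity(current, future, cell_km, years, tol):
    """Present-to-future velocity for each cell of a 1-D transect.

    For every cell, find the nearest cell whose future value lies within
    `tol` of this cell's current value, and divide that distance by `years`.
    Cells with no match get None.
    """
    speeds = []
    for i, value_now in enumerate(current):
        distances = [
            abs(i - j) * cell_km
            for j, value_future in enumerate(future)
            if abs(value_future - value_now) <= tol
        ]
        speeds.append(min(distances) / years if distances else None)
    return speeds
```

With a warming rate of 0.02 degC/yr and a gradient of 0.004 degC/km, `gradient_velocity` gives about 5 km/yr. Swapping the roles of `current` and `future` in the second function would give the future-to-present (backward) velocity.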
Variational inference & deep learning: A new synthesis

OpenAIRE

Kingma, D.P.

2017-01-01

In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
A historical perspective of algorithmic lateral inhibition and accumulative computation in computer vision

OpenAIRE

Delgado García, Ana E.; Carmona, Enrique; Fernández Caballero, Antonio; López Bonal, María Teresa

2011-01-01

Certainly, one of the prominent ideas of Professor José Mira was that it is absolutely mandatory to specify the mechanisms and/or processes underlying each task and inference mentioned in an architecture in order to make that architecture operational. The conjecture of the last fifteen years of joint research has been that any bottom-up organization may be made operational using two biologically inspired methods called “algorithmic lateral inhibition”, a generalization of lateral inhibition a...
Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling

NARCIS (Netherlands)

Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.

2006-01-01

We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar
BAYESIAN INFERENCE OF CMB GRAVITATIONAL LENSING

Energy Technology Data Exchange (ETDEWEB)

Anderes, Ethan [Department of Statistics, University of California, Davis, CA 95616 (United States); Wandelt, Benjamin D.; Lavaux, Guilhem [Sorbonne Universités, UPMC Univ Paris 06 and CNRS, UMR7095, Institut d’Astrophysique de Paris, F-75014, Paris (France)

2015-08-01

The Planck satellite, along with several ground-based telescopes, has mapped the cosmic microwave background (CMB) at sufficient resolution and signal-to-noise so as to allow a detection of the subtle distortions due to the gravitational influence of the intervening matter distribution. A natural modeling approach is to write a Bayesian hierarchical model for the lensed CMB in terms of the unlensed CMB and the lensing potential. So far there has been no feasible algorithm for inferring the posterior distribution of the lensing potential from the lensed CMB map. We propose a solution that allows efficient Markov Chain Monte Carlo sampling from the joint posterior of the lensing potential and the unlensed CMB map using the Hamiltonian Monte Carlo technique. The main conceptual step in the solution is a re-parameterization of CMB lensing in terms of the lensed CMB and the “inverse lensing” potential. We demonstrate a fast implementation on simulated data, including noise and a sky cut, that uses a further acceleration based on a very mild approximation of the inverse lensing potential. We find that the resulting Markov Chain has short correlation lengths and excellent convergence properties, making it promising for applications to high-resolution CMB data sets in the future.
Reasoning about Informal Statistical Inference: One Statistician's View

Science.gov (United States)

Rossman, Allan J.

2008-01-01

This paper identifies key concepts and issues associated with the reasoning of informal statistical inference. I focus on key ideas of inference that I think all students should learn, including at secondary level as well as tertiary. I argue that a fundamental component of inference is to go beyond the data at hand, and I propose that statistical…
Meta-learning framework applied in bioinformatics inference system design.

Science.gov (United States)

Arredondo, Tomás; Ormazábal, Wladimir

2015-01-01

This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.
A Markov chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates

DEFF Research Database (Denmark)

Hobolth, Asger

2008-01-01

The evolution of DNA sequences can be described by discrete state continuous time Markov processes on a phylogenetic tree. We consider neighbor-dependent evolutionary models where the instantaneous rate of substitution at a site depends on the states of the neighboring sites. Neighbor-dependent substitution models are analytically intractable and must be analyzed using either approximate or simulation-based methods. We describe statistical inference of neighbor-dependent models using a Markov chain Monte Carlo expectation maximization (MCMC-EM) algorithm. In the MCMC-EM algorithm, the high-dimensional integrals required in the EM algorithm are estimated using MCMC sampling. The MCMC sampler requires simulation of sample paths from a continuous time Markov process, conditional on the beginning and ending states and the paths of the neighboring sites. An exact path sampling algorithm is developed...
Boosting Bayesian parameter inference of nonlinear stochastic differential equation models by Hamiltonian scale separation.

Science.gov (United States)

Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

2016-04-01

Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Bayesian Inference of a Multivariate Regression Model

Directory of Open Access Journals (Sweden)

Marick S. Sinay

2014-01-01

We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Integrated artificial intelligence algorithm for skin detection

Directory of Open Access Journals (Sweden)

Bush Idoko John

2018-01-01

The detection of skin colour is a useful and well-known technique with a wide range of applications in both diagnostic analysis and human-computer interaction. Various problems could be solved by simply providing an appropriate method for pixel-like skin parts. Presented in this study is a colour segmentation algorithm that works directly in RGB colour space without converting the colour space. The genfis function used in this study formed the Sugeno fuzzy network and, using the Fuzzy C-Mean (FCM) clustering rule, clustered the data, generating a rule for each cluster/class. Finally, the corresponding output from the pseudo-polynomial data mapping is obtained from the input dataset to the adaptive neuro fuzzy inference system (ANFIS).
Towards a Framework for Evaluating and Comparing Diagnosis Algorithms

Science.gov (United States)

Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Kuhn, Lukas; de Kleer, Johan; van Gemund, Arjan; Feldman, Alexander

2009-01-01

Diagnostic inference involves the detection of anomalous system behavior and the identification of its cause, possibly down to a failed unit or to a parameter of a failed unit. Traditional approaches to solving this problem include expert/rule-based, model-based, and data-driven methods. Each approach (and various techniques within each approach) uses different representations of the knowledge required to perform the diagnosis. The sensor data is expected to be combined with these internal representations to produce the diagnosis result. In spite of the availability of various diagnosis technologies, there have been only minimal efforts to develop a standardized software framework to run, evaluate, and compare different diagnosis technologies on the same system. This paper presents a framework that defines a standardized representation of the system knowledge, the sensor data, and the form of the diagnosis results and provides a run-time architecture that can execute diagnosis algorithms, send sensor data to the algorithms at appropriate time steps from a variety of sources (including the actual physical system), and collect resulting diagnoses. We also define a set of metrics that can be used to evaluate and compare the performance of the algorithms, and provide software to calculate the metrics.
Inference and Analysis of Population Structure Using Genetic Data and Network Theory.

Science.gov (United States)

Greenbaum, Gili; Templeton, Alan R; Bar-David, Shirli

2016-04-01

Clustering individuals to subpopulations based on genetic data has become commonplace in many genetic studies. Inference about population structure is most often done by applying model-based approaches, aided by visualization using distance-based approaches such as multidimensional scaling. While existing distance-based approaches suffer from a lack of statistical rigor, model-based approaches entail assumptions of prior conditions such as that the subpopulations are at Hardy-Weinberg equilibria. Here we present a distance-based approach for inference about population structure using genetic data by defining population structure using network theory terminology and methods. A network is constructed from a pairwise genetic-similarity matrix of all sampled individuals. The community partition, a partition of a network to dense subgraphs, is equated with population structure, a partition of the population to genetically related groups. Community-detection algorithms are used to partition the network into communities, interpreted as a partition of the population to subpopulations. The statistical significance of the structure can be estimated by using permutation tests to evaluate the significance of the partition's modularity, a network theory measure indicating the quality of community partitions. To further characterize population structure, a new measure of the strength of association (SA) for an individual to its assigned community is presented. The strength of association distribution (SAD) of the communities is analyzed to provide additional population structure characteristics, such as the relative amount of gene flow experienced by the different subpopulations and identification of hybrid individuals. Human genetic data and simulations are used to demonstrate the applicability of the analyses. The approach presented here provides a novel, computationally efficient model-free method for inference about population structure that does not entail assumption of
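The modularity-based significance test outlined in the abstract above can be sketched in a few lines. The code computes Newman's modularity for a weighted similarity network and estimates a p-value by shuffling community labels. The label-shuffling null is an assumption chosen for brevity, and the authors' permutation scheme may differ; the function names and the default of 999 permutations are likewise hypothetical.

```python
import random


def modularity(weights, labels):
    """Newman modularity Q of a partition of a weighted undirected network.

    weights: symmetric matrix (list of lists) with a zero diagonal.
    labels: community label for each node.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]
    two_m = sum(strength)  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


def permutation_p_value(weights, labels, n_perm=999, seed=0):
    """Share of label shufflings whose modularity reaches the observed value."""
    rng = random.Random(seed)
    observed = modularity(weights, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if modularity(weights, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

In practice the `weights` matrix would come from the pairwise genetic-similarity matrix and `labels` from a community-detection algorithm; both are taken as given here.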
Incorporating time-delays in S-System model for reverse engineering genetic networks.

Science.gov (United States)

Chowdhury, Ahsan Raja; Chetty, Madhu; Vinh, Nguyen Xuan

2013-06-18

In any gene regulatory network (GRN), the complex interactions occurring amongst transcription factors and target genes can be either instantaneous or time-delayed. However, many existing modeling approaches currently applied for inferring GRNs are unable to represent both these interactions simultaneously. As a result, all these approaches cannot detect important interactions of the other type. S-System model, a differential equation based approach which has been increasingly applied for modeling GRNs, also suffers from this limitation. In fact, all S-System based existing modeling approaches have been designed to capture only instantaneous interactions, and are unable to infer time-delayed interactions. In this paper, we propose a novel Time-Delayed S-System (TDSS) model which uses a set of delay differential equations to represent the system dynamics. The ability to incorporate time-delay parameters in the proposed S-System model enables simultaneous modeling of both instantaneous and time-delayed interactions. Furthermore, the delay parameters are not limited to just positive integer values (corresponding to time stamps in the data), but can also take fractional values. Moreover, we also propose a new criterion for model evaluation exploiting the sparse and scale-free nature of GRNs to effectively narrow down the search space, which not only reduces the computation time significantly but also improves model accuracy. The evaluation criterion systematically adapts the max-min in-degrees and also systematically balances the effect of network accuracy and complexity during optimization. The four well-known performance measures applied to the experimental studies on synthetic networks with various time-delayed regulations clearly demonstrate that the proposed method can capture both instantaneous and delayed interactions correctly with high precision. The experiments carried out on two well-known real-life networks, namely IRMA and SOS DNA repair network in
Statistical inference and Aristotle's Rhetoric.

Science.gov (United States)

Macdonald, Ranald R

2004-11-01

Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
Preliminary Test of Adaptive Neuro-Fuzzy Inference System Controller for Spacecraft Attitude Control

Directory of Open Access Journals (Sweden)

Sung-Woo Kim

2012-12-01

The problem of spacecraft attitude control is solved using an adaptive neuro-fuzzy inference system (ANFIS). An ANFIS produces a control signal for one of the three axes of a spacecraft’s body frame, so in total three ANFISs are constructed for 3-axis attitude control. The fuzzy inference system of the ANFIS is initialized using a subtractive clustering method. The ANFIS is trained by a hybrid learning algorithm using the data obtained from attitude control simulations using state-dependent Riccati equation controller. The training data set for each axis is composed of state errors for 3 axes (roll, pitch, and yaw) and a control signal for one of the 3 axes. The stability region of the ANFIS controller is estimated numerically based on Lyapunov stability theory using a numerical method to calculate Jacobian matrix. To measure the performance of the ANFIS controller, root mean square error and correlation factor are used as performance indicators. The performance is tested on two ANFIS controllers trained in different conditions. The test results show that the performance indicators are proper in the sense that the ANFIS controller with the larger stability region provides better performance according to the performance indicators.
Children's and adults' judgments of the certainty of deductive inferences, inductive inferences, and guesses.

Science.gov (United States)

Pillow, Bradford H; Pearson, Raeanne M; Hecht, Mary; Bremer, Amanda

2010-01-01

Children and adults rated their own certainty following inductive inferences, deductive inferences, and guesses. Beginning in kindergarten, participants rated deductions as more certain than weak inductions or guesses. Deductions were rated as more certain than strong inductions beginning in Grade 3, and fourth-grade children and adults differentiated strong inductions, weak inductions, and informed guesses from pure guesses. By Grade 3, participants also gave different types of explanations for their deductions and inductions. These results are discussed in relation to children's concepts of cognitive processes, logical reasoning, and epistemological development.
Genetic Algorithm-Based Optimization to Match Asteroid Energy Deposition Curves

Science.gov (United States)

Tarano, Ana; Mathias, Donovan; Wheeler, Lorien; Close, Sigrid

2018-01-01

An asteroid entering Earth's atmosphere deposits energy along its path due to thermal ablation and dissipative forces that can be measured by ground-based and spaceborne instruments. Inference of pre-entry asteroid properties and characterization of the atmospheric breakup is facilitated by using an analytic fragment-cloud model (FCM) in conjunction with a Genetic Algorithm (GA). This optimization technique is used to inversely solve for the asteroid's entry properties, such as diameter, density, strength, velocity, entry angle, and strength scaling, from simulations using FCM. The previous parameters' fitness evaluation involves minimizing error to ascertain the best match between the physics-based calculated energy deposition and the observed meteors. This steady-state GA provided sets of solutions agreeing with literature, such as the meteor from Chelyabinsk, Russia in 2013 and Tagish Lake, Canada in 2000, which were used as case studies in order to validate the optimization routine. The assisted exploration and exploitation of this multi-dimensional search space enables inference and uncertainty analysis that can inform studies of near-Earth asteroids and consequently improve risk assessment.

Inferring the role of transcription factors in regulatory networks

Directory of Open Access Journals (Sweden)

Le Borgne Michel

2008-05-01

Background: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. Results: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. Conclusion: Our approach does not require accurate expression levels nor times series. Nevertheless, we show on both data, real and artificial, that a relatively small number of perturbation experiments are enough to determine
A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults.

Science.gov (United States)

Sun, Rui; Cheng, Qi; Wang, Guanyu; Ochieng, Washington Yotto

2017-09-29

The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs' flight control systems and are essential for flight safety. In order to ensure flight safety, timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuron Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in UAVs. Contrary to the classic UAV sensor fault detection algorithms, based on predefined or modelled faults, the proposed algorithm combines an online data training mechanism with the ANFIS-based decision system. The main advantages of this algorithm are that it allows real-time model-free residual analysis from Kalman Filter (KF) estimates and the ANFIS to build a reliable fault detection system. In addition, it allows fast and accurate detection of faults, which makes it suitable for real-time applications. Experimental results have demonstrated the effectiveness of the proposed fault detection method in terms of accuracy and misdetection rate.
A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults

Directory of Open Access Journals (Sweden)

Rui Sun

2017-09-01

The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs’ flight control systems and are essential for flight safety. In order to ensure flight safety, timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuron Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in UAVs. Contrary to the classic UAV sensor fault detection algorithms, based on predefined or modelled faults, the proposed algorithm combines an online data training mechanism with the ANFIS-based decision system. The main advantages of this algorithm are that it allows real-time model-free residual analysis from Kalman Filter (KF) estimates and the ANFIS to build a reliable fault detection system. In addition, it allows fast and accurate detection of faults, which makes it suitable for real-time applications. Experimental results have demonstrated the effectiveness of the proposed fault detection method in terms of accuracy and misdetection rate.
Deep Learning for Population Genetic Inference.

Science.gov (United States)

Sheehan, Sara; Song, Yun S

2016-03-01

Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Towards a computational- and algorithmic-level account of concept blending using analogies and amalgams

Science.gov (United States)

Besold, Tarek R.; Kühnberger, Kai-Uwe; Plaza, Enric

2017-10-01

Concept blending - a cognitive process which allows for the combination of certain elements (and their relations) from originally distinct conceptual spaces into a new unified space combining these previously separate elements, and enables reasoning and inference over the combination - is taken as a key element of creative thought and combinatorial creativity. In this article, we summarise our work towards the development of a computational-level and algorithmic-level account of concept blending, combining approaches from computational analogy-making and case-based reasoning (CBR). We present the theoretical background, as well as an algorithmic proposal integrating higher-order anti-unification matching and generalisation from analogy with amalgams from CBR. The feasibility of the approach is then exemplified in two case studies.
Accessing primary care Big Data: the development of a software algorithm to explore the rich content of consultation records.

Science.gov (United States)

MacRae, J; Darlow, B; McBain, L; Jones, O; Stubbe, M; Turner, N; Dowell, A

2015-08-21

To develop a natural language processing software inference algorithm to classify the content of primary care consultations using electronic health record Big Data and subsequently test the algorithm's ability to estimate the prevalence and burden of childhood respiratory illness in primary care. Algorithm development and validation study. To classify consultations, the algorithm is designed to interrogate clinical narrative entered as free text, diagnostic (Read) codes created and medications prescribed on the day of the consultation. Thirty-six consenting primary care practices from a mixed urban and semirural region of New Zealand. Three independent sets of 1200 child consultation records were randomly extracted from a data set of all general practitioner consultations in participating practices between 1 January 2008-31 December 2013 for children under 18 years of age (n=754,242). Each consultation record within these sets was independently classified by two expert clinicians as respiratory or non-respiratory, and subclassified according to respiratory diagnostic categories to create three 'gold standard' sets of classified records. These three gold standard record sets were used to train, test and validate the algorithm. Sensitivity, specificity, positive predictive value and F-measure were calculated to illustrate the algorithm's ability to replicate judgements of expert clinicians within the 1200 record gold standard validation set. The algorithm was able to identify respiratory consultations in the 1200 record validation set with a sensitivity of 0.72 (95% CI 0.67 to 0.78) and a specificity of 0.95 (95% CI 0.93 to 0.98). The positive predictive value of algorithm respiratory classification was 0.93 (95% CI 0.89 to 0.97). The positive predictive value of the algorithm classifying consultations as being related to specific respiratory diagnostic categories ranged from 0.68 (95% CI 0.40 to 1.00; other respiratory conditions) to 0.91 (95% CI 0.79 to 1
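The evaluation measures named in the abstract above (sensitivity, specificity, positive predictive value and F-measure) all follow from the four cells of a confusion matrix. The sketch below is a minimal version, assuming a binary respiratory/non-respiratory label and assuming that "F-measure" means the balanced F1 score. The function name is hypothetical, and the counts in the usage note are invented for illustration and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fn: gold-standard positives the algorithm caught / missed.
    tn/fp: gold-standard negatives the algorithm cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)   # share of true positives detected
    specificity = tn / (tn + fp)   # share of true negatives detected
    ppv = tp / (tp + fp)           # share of positive calls that are correct
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_measure": f_measure,
    }
```

For example, `classification_metrics(tp=80, fp=10, tn=90, fn=20)` gives a sensitivity of 0.8 and a specificity of 0.9. The sketch does not guard against empty classes, which would raise a division-by-zero error.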
A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression.

Science.gov (United States)

Liu, Fang; Eugenio, Evercita C

2018-04-01

Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
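The point about modeling the inflation at zero/one explicitly, rather than nudging those values into (0, 1), can be illustrated with a minimal sketch of a zoib log-likelihood. This is an illustration under stated assumptions and not the authors' software: it uses no covariates, the mean/precision parameterization Beta(mu*phi, (1-mu)*phi), and hypothetical names (`beta_logpdf`, `zoib_loglik`, `p0`, `p1`) chosen here for clarity.

```python
import math


def _safe_log(x):
    # log(0) is treated as -inf so an impossible observation gets zero likelihood.
    return math.log(x) if x > 0.0 else -math.inf


def beta_logpdf(y, mu, phi):
    """Log-density of Beta(mu*phi, (1-mu)*phi) at y in the open interval (0, 1)."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)


def zoib_loglik(ys, p0, p1, mu, phi):
    """Log-likelihood of a zero-or-one-inflated beta model without covariates.

    p0 and p1 are the point masses at 0 and 1 (hypothetical names); the
    remaining mass 1 - p0 - p1 is spread over (0, 1) by the beta density.
    """
    if p0 < 0.0 or p1 < 0.0 or p0 + p1 >= 1.0:
        raise ValueError("need p0 >= 0, p1 >= 0 and p0 + p1 < 1")
    total = 0.0
    for y in ys:
        if y == 0.0:
            total += _safe_log(p0)
        elif y == 1.0:
            total += _safe_log(p1)
        else:
            total += math.log(1.0 - p0 - p1) + beta_logpdf(y, mu, phi)
    return total
```

Setting `p1 = 0` gives the zero-inflated case for outcomes in [0, 1), and `p0 = 0` the one-inflated case for outcomes in (0, 1]. A likelihood-based fit would maximize this function numerically, while a Bayesian fit would combine it with priors and sample with MCMC.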
Using Alien Coins to Test Whether Simple Inference Is Bayesian

Science.gov (United States)

Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.

2016-01-01

Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
On Maximum Entropy and Inference

Directory of Open Access Journals (Sweden)

Luigi Gresele

2017-11-01

Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method by showing its ability to recover the correct model in a few prototype cases and discuss its application to a real dataset.
PREFACE: ELC International Meeting on Inference, Computation, and Spin Glasses (ICSG2013)

Science.gov (United States)

Kabashima, Yoshiyuki; Hukushima, Koji; Inoue, Jun-ichi; Tanaka, Toshiyuki; Watanabe, Osamu

2013-12-01

The close relationship between probability-based inference and statistical mechanics of disordered systems has been noted for some time. This relationship has provided researchers with a theoretical foundation in various fields of information processing for analytical performance evaluation and construction of efficient algorithms based on message-passing or Monte Carlo sampling schemes. The ELC International Meeting on 'Inference, Computation, and Spin Glasses (ICSG2013)' was held in Sapporo on 28-30 July 2013. The meeting was organized as a satellite meeting of STATPHYS25 in order to offer a forum where concerned researchers can assemble and exchange information on the latest results and newly established methodologies, and discuss future directions of the interdisciplinary studies between statistical mechanics and information sciences. Financial support from Grant-in-Aid for Scientific Research on Innovative Areas, MEXT, Japan 'Exploring the Limits of Computation (ELC)' is gratefully acknowledged. We are pleased to publish 23 papers contributed by invited speakers of ICSG2013 in this volume of Journal of Physics: Conference Series. We hope that this volume will promote further development of this highly vigorous interdisciplinary field between statistical mechanics and information/computer science. Editors and ICSG2013 Organizing Committee: Koji Hukushima; Jun-ichi Inoue (Local Chair of ICSG2013); Yoshiyuki Kabashima (Editor-in-Chief); Toshiyuki Tanaka; Osamu Watanabe (General Chair of ICSG2013).
Compiling Relational Bayesian Networks for Exact Inference

DEFF Research Database (Denmark)

Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan

2004-01-01

We describe a system for exact inference with relational Bayesian networks as defined in the publicly available Primula tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
MiR-145 mediates zebrafish hepatic outgrowth through progranulin A signaling.

Directory of Open Access Journals (Sweden)

Ya-Wen Li

MicroRNAs (miRs) are mRNA-regulatory molecules that fine-tune gene expression and modulate both development and tumorigenesis. Our previous studies identified progranulin A (GrnA) as a growth factor that induces zebrafish hepatic outgrowth through MET signaling. We also found that miR-145 is one of the potential fine-tuning regulators of GrnA involved in embryonic hepatic outgrowth. The low level of miR-145 seen in hepatocarcinogenesis has been shown to promote pathological liver growth. However, little is known about the regulatory mechanism of miR-145 in embryonic liver development. In this study, we demonstrate a significant decrease in miR-145 expression during hepatogenesis. We modulate miR-145 expression in zebrafish embryos by injection with a miR-145 mimic or a miR-145 hairpin inhibitor. Altered embryonic liver outgrowth is observed in response to miR-145 expression modulation. We also confirm a critical role of miR-145 in hepatic outgrowth by using whole-mount in situ hybridization. Loss of miR-145 expression in embryos results in hepatic cell proliferation, and vice versa. Furthermore, we demonstrate that GrnA is a target of miR-145 and that GrnA-induced MET signaling is also regulated by miR-145, as determined by luciferase reporter assay and gene expression analysis, respectively. In addition, co-injection of GrnA mRNA with miR-145 mimic or MO-GrnA with miR-145 inhibitor restores the liver defects caused by dysregulation of miR-145 expression. In conclusion, our findings suggest an important role of miR-145 in regulating GrnA-dependent hepatic outgrowth in zebrafish embryonic development.
Study on Data Clustering and Intelligent Decision Algorithm of Indoor Localization

Science.gov (United States)

Liu, Zexi

2018-01-01

Indoor positioning technology gives people positional awareness inside buildings, but it suffers from the limited coverage of any single network and from redundant location data. This article therefore studies a clustering algorithm for indoor positioning data together with intelligent decision-making. It lays out the basic design of multi-source indoor positioning technology and analyzes a fingerprint localization algorithm based on distance measurement, combined with the integration of position and orientation from inertial devices. By optimizing the clustering of massive indoor location data, the work realizes data normalization preprocessing, multi-dimensional controllable cluster centers and multi-factor clustering, which reduces the redundancy of the location data. In addition, a path inference and decision method based on a neural network is proposed, with a sparse data input layer, a dynamic feedback hidden layer and an output layer; its low-dimensional results improve intelligent navigation path planning.
Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

Directory of Open Access Journals (Sweden)

Holland Barbara R

2006-07-01

Background: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways, resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally, we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion: Using the most treelike distance matrices, as
Likelihood Inference of Nonlinear Models Based on a Class of Flexible Skewed Distributions

Directory of Open Access Journals (Sweden)

Xuedong Chen

2014-01-01

This paper deals with likelihood inference for nonlinear models with a flexible skew-t-normal (FSTN) distribution, which is proposed within the general framework of flexible skew-symmetric (FSS) distributions by combining it with the skew-t-normal (STN) distribution. In comparison with common skewed distributions such as the skew normal (SN) and skew-t (ST), as well as scale mixtures of skew normal (SMSN), the FSTN distribution offers more flexibility and robustness in the presence of skewed, heavy-tailed and especially multimodal outcomes. However, for this distribution the usual approach of maximum likelihood estimation based on the EM algorithm becomes unavailable, and an alternative is to return to the original Newton-Raphson type method. To improve the estimation as well as confidence estimation and hypothesis testing for the parameters of interest, a modified Newton-Raphson iterative algorithm based on profile likelihood is presented for nonlinear regression models with the FSTN distribution, and the confidence interval and hypothesis test are then developed. Furthermore, a real example and a simulation are conducted to demonstrate the usefulness and superiority of our approach.
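For orientation, the plain Newton-Raphson iteration that such methods build on can be sketched as follows. This is a generic textbook version applied to a one-parameter exponential likelihood, chosen here only because its derivatives are simple; it is not the modified profile-likelihood algorithm of the paper, and the function and variable names are hypothetical.

```python
def newton_raphson(score, hessian, theta0, tol=1e-10, max_iter=100):
    """Maximize a log-likelihood by iterating theta <- theta - score/hessian.

    score and hessian are the first and second derivatives of the
    log-likelihood with respect to the single parameter theta.
    """
    theta = theta0
    for _ in range(max_iter):
        step = score(theta) / hessian(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


# Toy example: exponential rate, log-likelihood n*log(lam) - lam*sum(x).
data = [1.0, 2.0, 3.0]
n, s = len(data), sum(data)
rate_hat = newton_raphson(
    score=lambda lam: n / lam - s,        # first derivative
    hessian=lambda lam: -n / lam ** 2,    # second derivative
    theta0=0.3,
)
```

The closed-form maximum likelihood estimate here is n / sum(x), so the iteration can be checked against it. As with any Newton-type method, convergence depends on the starting value, which is one reason modified schemes are of interest.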
Progranulin mutation causes frontotemporal dementia in the Swedish Karolinska family.

Science.gov (United States)

Chiang, Huei-Hsin; Rosvall, Lina; Brohede, Jesper; Axelman, Karin; Björk, Behnosh F; Nennesmo, Inger; Robins, Tiina; Graff, Caroline

2008-11-01

Frontotemporal dementia (FTD) is a neurodegenerative disease characterized by cognitive impairment, language dysfunction, and/or changes in personality. Recently it has been shown that progranulin (GRN) mutations can cause FTD as well as other neurodegenerative phenotypes. DNA from 30 family members, of whom seven were diagnosed with FTD, in the Karolinska family was available for GRN sequencing. Fibroblast cell mRNA from one affected family member and six control individuals was available for relative quantitative real-time polymerase chain reaction to investigate the effect of the mutation. Furthermore, the cDNA of an affected individual was sequenced. Clinical and neuropathologic findings of a previously undescribed family branch are presented. A frameshift mutation in GRN (g.102delC) was detected in all affected family members and absent in four unaffected family members older than 70 years. Real-time polymerase chain reaction data showed an approximately 50% reduction of GRN fibroblast mRNA in an affected individual. The mutated mRNA transcripts were undetectable by cDNA sequencing. Segregation and RNA analyses showed that the g.102delC mutation, previously reported, causes FTD in the Karolinska family. Our findings add further support to the significance of GRN in FTD etiology and the presence of modifying genes, which emphasize the need for further studies into the mechanisms of clinical heterogeneity. However, the results already call for attention to the complexity of predictive genetic testing of GRN mutations.
Circulating progranulin as a biomarker for neurodegenerative diseases.

Science.gov (United States)

Ghidoni, Roberta; Paterlini, Anna; Benussi, Luisa

2012-01-01

Progranulin is a growth factor involved in the regulation of multiple processes including tumorigenesis, wound repair, development, and inflammation. The recent discovery that mutations in the gene encoding for progranulin (GRN) cause frontotemporal lobar degeneration (FTLD), and other neurodegenerative diseases leading to dementia, has brought renewed interest in progranulin and its functions in the central nervous system. GRN null mutations cause protein haploinsufficiency, leading to a significant decrease in progranulin levels that can be detected in plasma, serum and cerebrospinal fluid (CSF) of mutation carriers. The dosage of circulating progranulin sped up the identification of GRN mutations thus favoring genotype-phenotype correlation studies. Researchers demonstrated that, in GRN null mutation carriers, the shortage of progranulin invariably precedes clinical symptoms and thus mutation carriers are "captured" regardless of their disease status. GRN is a particularly appealing gene for drug targeting, in the way that boosting its expression may be beneficial for mutation carriers, preventing or delaying the onset of GRN-related neurodegenerative diseases. Physiological regulation of progranulin expression level is only partially known. Progranulin expression reflects mutation status and, intriguingly, its levels can be modulated by some additional factor (i.e. genetic background; drugs). Thus, factors increasing the production and secretion of progranulin from the normal gene are promising potential therapeutic avenues. In conclusion, peripheral progranulin is a nonintrusive highly accurate biomarker for early identification of mutation carriers and for monitoring future treatments that might boost the level of this protein.
Causal inference in economics and marketing.

Science.gov (United States)

Varian, Hal R

2016-07-05

This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.
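The idea of estimating a counterfactual with a predictive model can be sketched in a few lines. The example below is a minimal illustration, assuming a single outcome series observed before and after a treatment, with a least-squares linear trend standing in for whatever machine learning model one might actually use; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimated_effect(pre, post):
    """Average gap between observed post-treatment outcomes and the counterfactual.

    pre and post are outcomes in consecutive periods; the treatment is
    assumed to start at the first period of post.
    """
    intercept, slope = fit_line(list(range(len(pre))), pre)
    # Counterfactual: what the pre-treatment trend predicts for the later periods.
    periods = range(len(pre), len(pre) + len(post))
    counterfactual = [intercept + slope * t for t in periods]
    gaps = [actual - predicted for actual, predicted in zip(post, counterfactual)]
    return sum(gaps) / len(gaps)
```

For instance, with `pre = [10, 12, 14, 16]` the fitted trend predicts 18 and 20 for the next two periods, so observed values of 23 and 25 would be read as an average lift of 5. The estimate is only as credible as the assumption that the pre-treatment pattern would have continued.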
Uncertainty in prediction and in inference

International Nuclear Information System (INIS)

Hilgevoord, J.; Uffink, J.

1991-01-01

The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in inference can be obtained by means of the so-called statistical distance between probability distributions. When applied to quantum mechanics, this distance leads to a measure of the distinguishability of quantum states, which essentially is the absolute value of the matrix element between the states. The importance of this result to the quantum mechanical uncertainty principle is noted. The second part of the paper provides a derivation of the statistical distance on the basis of the so-called method of support.
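A statistical distance between two discrete probability distributions can be computed as below. The sketch assumes the distance is taken in the commonly used arccos form, d(p, q) = arccos(sum_i sqrt(p_i * q_i)); the abstract does not state the formula, so this choice and the function name are assumptions made for illustration.

```python
import math


def statistical_distance(p, q):
    """Angle between the square-root vectors of two discrete distributions.

    Assumed form: arccos of the overlap sum_i sqrt(p_i * q_i). The result
    is 0 for identical distributions and pi/2 for non-overlapping ones.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    overlap = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to [0, 1] to guard against rounding just outside acos's domain.
    return math.acos(min(1.0, max(0.0, overlap)))
```

With p_i = |<i|psi>|^2 and q_i = |<i|phi>|^2, the overlap is bounded below by |<psi|phi>|, which is how a quantity of this kind connects to the absolute value of the matrix element between two quantum states.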
Robust Inference of Population Structure for Ancestry Prediction and Correction of Stratification in the Presence of Relatedness

Science.gov (United States)

Conomos, Matthew P.; Miller, Mike; Thornton, Timothy

2016-01-01

Population structure inference with genetic data has been motivated by a variety of applications in population genetics and genetic association studies. Several approaches have been proposed for the identification of genetic ancestry differences in samples where study participants are assumed to be unrelated, including principal components analysis (PCA), multi-dimensional scaling (MDS), and model-based methods for proportional ancestry estimation. Many genetic studies, however, include individuals with some degree of relatedness, and existing methods for inferring genetic ancestry fail in related samples. We present a method, PC-AiR, for robust population structure inference in the presence of known or cryptic relatedness. PC-AiR utilizes genome-screen data and an efficient algorithm to identify a diverse subset of unrelated individuals that is representative of all ancestries in the sample. The PC-AiR method directly performs PCA on the identified ancestry representative subset and then predicts components of variation for all remaining individuals based on genetic similarities. In simulation studies and in applications to real data from Phase III of the HapMap Project, we demonstrate that PC-AiR provides a substantial improvement over existing approaches for population structure inference in related samples. We also demonstrate significant efficiency gains, where a single axis of variation from PC-AiR provides better prediction of ancestry in a variety of structure settings than using ten (or more) components of variation from widely used PCA and MDS approaches. Finally, we illustrate that PC-AiR can provide improved population stratification correction over existing methods in genetic association studies with population structure and relatedness. PMID:25810074
