link prediction algorithms: Topics by WorldWideScience.org

Sample records for link prediction algorithms

The Algorithm of Link Prediction on Social Network

Directory of Open Access Journals (Sweden)

Liyan Dong

2013-01-01

At present, most link prediction algorithms are based on the similarity between two entities. Social network topology information is one of the main sources for designing the similarity function between entities, but existing link prediction algorithms do not exploit the network topology information sufficiently. To address the shortcomings of traditional link prediction algorithms, we propose two improved algorithms: the CNGF algorithm, based on local information, and the KatzGF algorithm, based on global network information. To address the defect of assuming a stationary social network, we also provide a link prediction algorithm based on the multiple attribute information of nodes. Finally, we verify these algorithms on the DBLP data set, and the experimental results show that the improved algorithms outperform the traditional link prediction algorithms.
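As an illustration of a similarity score built from global topology, the following minimal sketch computes the standard Katz index, a damped count of walks of every length between two nodes, truncated at a maximum length. It is not the KatzGF variant proposed in the paper, whose details are not given in this abstract; the toy graph and the parameters `beta` and `max_len` are assumptions chosen only for illustration.

```python
def katz_scores(adj, beta=0.1, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * (A**l)."""
    n = len(adj)
    power = [row[:] for row in adj]          # current matrix power, starts at A^1
    scores = [[0.0] * n for _ in range(n)]
    weight = beta                            # beta**l for the current l
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        # Next power: A^(l+1) = A^l * A
        power = [
            [sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        weight *= beta
    return scores


# Hypothetical toy network: an undirected path 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = katz_scores(A)
# Nodes 0 and 2 (two steps apart) score higher than nodes 0 and 3 (three steps apart).
print(S[0][2], S[0][3])
```

A smaller `beta` makes short walks dominate, so the score behaves more like a local common-neighbour count; a larger `beta` gives distant structure more influence.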
Predicting cryptic links in host-parasite networks.

Directory of Open Access Journals (Sweden)

Tad Dallas

2017-05-01

Networks are a way to represent interactions among one (e.g., social networks) or more (e.g., plant-pollinator networks) classes of nodes. The ability to predict likely, but unobserved, interactions has generated a great deal of interest, and is sometimes referred to as the link prediction problem. However, most studies of link prediction have focused on social networks, and have assumed a completely censused network. In biological networks, it is unlikely that all interactions are censused, and ignoring incomplete detection of interactions may lead to biased or incorrect conclusions. Previous attempts to predict network interactions have relied on known properties of network structure, making the approach sensitive to observation errors. This is an obvious shortcoming, as networks are dynamic, and sometimes not well sampled, leading to incomplete detection of links. Here, we develop an algorithm to predict missing links based on conditional probability estimation and associated, node-level features. We validate this algorithm on simulated data, and then apply it to a desert small mammal host-parasite network. Our approach achieves high accuracy on simulated and observed data, providing a simple method to accurately predict missing links in networks without relying on prior knowledge about network structure.
Link prediction with node clustering coefficient

Science.gov (United States)

Wu, Zhihao; Lin, Youfang; Wang, Jing; Gregory, Steve

2016-06-01

Predicting missing links in incomplete complex networks efficiently and accurately is still a challenging problem. The recently proposed Cannistraci-Alanis-Ravasi (CAR) index shows the power of local link/triangle information in improving link-prediction accuracy. Inspired by the idea of employing local link/triangle information, we propose a new similarity index with more local structure information. In our method, local link/triangle structure information is conveyed directly by the clustering coefficient of common neighbors. The clustering coefficient is effective in estimating the contribution of a common neighbor because it employs the links existing between neighbors of that common neighbor, and these links have the same structural position as the candidate link with respect to the common neighbor. In our experiments, three estimators (precision, AUP and AUC) are used to evaluate the accuracy of link prediction algorithms. Experimental results on ten tested networks drawn from various fields show that our new index is more effective in predicting missing links than the CAR index, especially for networks with low correlation between the number of common neighbors and the number of links between common neighbors.
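The idea of weighting each common neighbor by its clustering coefficient can be sketched as follows. This is a minimal illustration that assumes the score of a candidate link is simply the sum of the clustering coefficients of the common neighbors; the exact definition used in the paper may differ, and the toy graph and function names are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(graph, z):
    """Fraction of pairs of z's neighbours that are themselves linked."""
    nbrs = graph[z]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def cc_common_neighbour_score(graph, x, y):
    """Sum the clustering coefficients of the common neighbours of x and y."""
    common = graph[x] & graph[y]
    return sum(clustering_coefficient(graph, z) for z in common)


# Hypothetical undirected graph stored as node -> set of neighbours.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
score_ab = cc_common_neighbour_score(graph, "a", "b")
print(score_ab)
```

Here `a` and `b` share the neighbours `c` and `d`; `c` sits in a denser neighbourhood than `d`, so it contributes more to the score than a plain common-neighbour count would allow.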
Graph regularized nonnegative matrix factorization for temporal link prediction in dynamic networks

Science.gov (United States)

Ma, Xiaoke; Sun, Penggang; Wang, Yu

2018-04-01

Many networks derived from society and nature are temporal and incomplete. The temporal link prediction problem in networks is to predict links at time T + 1 based on a given temporal network from time 1 to T, which is essential to important applications. Current algorithms predict temporal links either by collapsing the dynamic networks or by collapsing features derived from each network, and are criticized for ignoring the connections among slices. To overcome this issue, we propose a novel graph regularized nonnegative matrix factorization algorithm (GrNMF) for the temporal link prediction problem without collapsing the dynamic networks. To obtain the feature for each network from 1 to t, GrNMF factorizes the matrix associated with that network while using the remaining networks as regularization, which provides a better way to characterize the topological information of temporal links. Then, the GrNMF algorithm collapses the feature matrices to predict temporal links. Compared with state-of-the-art methods, the proposed algorithm exhibits significantly improved accuracy by avoiding the collapse of temporal networks. Experimental results on a number of artificial and real temporal networks illustrate that the proposed method is not only more accurate but also more robust than state-of-the-art approaches.
An efficient link prediction index for complex military organization

Science.gov (United States)

Fan, Changjun; Liu, Zhong; Lu, Xin; Xiu, Baoxin; Chen, Qing

2017-03-01

Quality of information is crucial for decision-makers to judge battlefield situations and design the best operation plans. However, real intelligence data are often incomplete and noisy. If the complex military organization is modeled as a complex network, where nodes represent functional units and edges denote communication links, missing-link prediction methods and spurious-link identification algorithms can be applied. Traditional link prediction methods usually work well on homogeneous networks, but few work well on heterogeneous ones, and the military network is a typical heterogeneous network with different types of nodes and edges. In this paper, we propose a combined link prediction index that considers both the effects of node types and the structural similarities of nodes, and demonstrate that it is remarkably superior to all 25 existing similarity-based methods, both in predicting missing links and in identifying spurious links in real military network data. We also investigate the algorithms' robustness in noisy environments and find that mistaken information is more misleading than incomplete information in military areas, which differs from the situation in recommendation systems; our method maintains the best performance under small noise. Since real military network intelligence must first be carefully checked because of its significance, and link prediction methods are adopted only to purify the network of the remaining latent noise, the method proposed here is applicable in real situations. Finally, because the FINC-E model, used here to describe complex military organizations, is also suitable for many other social organizations, such as criminal networks and business organizations, our method has prospects in these areas for many tasks, such as detecting underground relationships between terrorists and predicting potential business markets for decision-makers.
Meta-path based heterogeneous combat network link prediction

Science.gov (United States)

Li, Jichao; Ge, Bingfeng; Yang, Kewei; Chen, Yingwu; Tan, Yuejin

2017-09-01

The combat system-of-systems in high-tech informative warfare, composed of many interconnected combat systems of different types, can be regarded as a type of complex heterogeneous network. Link prediction for heterogeneous combat networks (HCNs) is of significant military value, as it facilitates reconfiguring combat networks to represent the complex real-world network topology as appropriate with observed information. This paper proposes a novel integrated methodology framework called HCNMP (HCN link prediction based on meta-path) to predict multiple types of links simultaneously for an HCN. More specifically, the concept of HCN meta-paths is introduced, through which the HCNMP can accumulate information by extracting different features of HCN links for all the six defined types. Next, an HCN link prediction model, based on meta-path features, is built to predict all types of links of the HCN simultaneously. Then, the solution algorithm for the HCN link prediction model is proposed, in which the prediction results are obtained by iteratively updating with the newly predicted results until the results in the HCN converge or reach a certain maximum iteration number. Finally, numerical experiments on the dataset of a real HCN are conducted to demonstrate the feasibility and effectiveness of the proposed HCNMP, in comparison with 30 baseline methods. The results show that the performance of the HCNMP is superior to those of the baseline methods.
A novel time series link prediction method: Learning automata approach

Science.gov (United States)

Moradabadi, Behnaz; Meybodi, Mohammad Reza

2017-09-01

Link prediction is a main social network challenge that uses the network structure to predict future links. The common link prediction approaches for predicting hidden links use a static graph representation, where a snapshot of the network is analyzed to find hidden or future links. For example, similarity-metric-based link prediction is a common traditional approach that calculates the similarity metric for each non-connected pair of nodes, sorts the candidate links by their similarity metrics, and labels the links with higher similarity scores as the future links. Because people's activities in social networks are dynamic and uncertain, and the structure of the networks changes over time, using deterministic graphs for modeling and analysis of the social network may not be appropriate. In the time-series link prediction problem, the time series of link occurrences are used to predict the future links. In this paper, we propose a new time series link prediction method based on learning automata. In the proposed algorithm, there is one learning automaton for each link that must be predicted, and each learning automaton tries to predict the existence or non-existence of the corresponding link. To predict the link occurrence at time T, there is a chain consisting of stages 1 through T - 1, and the learning automaton passes through these stages to learn the existence or non-existence of the corresponding link. Our preliminary link prediction experiments with co-authorship and email networks have provided satisfactory results when time series link occurrences are considered.
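The traditional static approach described above (score every non-connected pair, sort, and label the top-scoring pairs as future links) can be sketched as follows. This illustrates only that baseline, not the learning-automata method; the common-neighbour count is used as the similarity metric by assumption, and the snapshot, `top_k`, and function name are hypothetical.

```python
from itertools import combinations


def rank_candidate_links(graph, top_k=2):
    """Score each non-connected pair by common-neighbour count and keep the top_k."""
    candidates = []
    for u, v in combinations(sorted(graph), 2):
        if v in graph[u]:
            continue                           # already linked, not a candidate
        score = len(graph[u] & graph[v])       # similarity metric (assumed)
        candidates.append((score, u, v))
    # Highest score first; ties broken by node identifiers for determinism.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(u, v, s) for s, u, v in candidates[:top_k]]


# Hypothetical snapshot of an undirected network: node -> set of neighbours.
snapshot = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3, 5},
    5: {4},
}
predicted = rank_candidate_links(snapshot, top_k=2)
print(predicted)
```

Because the ranking depends on a single snapshot, it cannot use the order in which links appeared, which is the limitation the time-series formulation is meant to address.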
Link prediction in multiplex online social networks

Science.gov (United States)

Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

2017-02-01

Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.
A class-based link prediction using Distance Dependent Chinese Restaurant Process

Science.gov (United States)

Andalib, Azam; Babamir, Seyed Morteza

2016-08-01

One of the important tasks in relational data analysis is link prediction, which has been successfully applied in many areas such as bioinformatics and information retrieval. Link prediction is defined as predicting the existence or absence of edges between nodes of a network. In this paper, we propose a novel method for link prediction based on the Distance Dependent Chinese Restaurant Process (DDCRP) model, which enables us to utilize information about the topological structure of the network, such as the shortest path and connectivity of the nodes. We also propose a new Gibbs sampling algorithm for computing the posterior distribution of the hidden variables based on the training data. Experimental results on three real-world datasets show the superiority of the proposed method over other probabilistic models for the link prediction problem.
Link-Prediction Enhanced Consensus Clustering for Complex Networks (Open Access)

Science.gov (United States)

2016-05-20

RESEARCH ARTICLE Link-Prediction Enhanced Consensus Clustering for Complex Networks Matthew Burgess1*, Eytan Adar1,2, Michael Cafarella1 1Computer...consensus clustering algorithm to enhance community detection on incomplete networks. Our framework utilizes existing community detection algorithms that...types of complex networks exhibit community structure: groups of highly connected nodes. Communities or clusters often reflect nodes that share similar
Named Entity Linking Algorithm

Directory of Open Access Journals (Sweden)

M. F. Panteleev

2017-01-01

In natural language text processing, Named Entity Linking (NEL) is the task of identifying an entity found in the text and linking it with an entity in a knowledge base (for example, DBpedia). Currently, there is a diversity of approaches to this problem, but two main classes can be identified: graph-based approaches and machine learning-based ones. An algorithm based on both graph and machine learning approaches is proposed, in accordance with the stated assumptions about the interrelations of named entities in a sentence and in general. In the case of graph-based approaches, it is necessary to solve the problem of identifying an optimal set of related entities according to some metric that characterizes the distance between these entities in a graph built on a knowledge base. Due to limitations in processing power, solving this task directly is impossible; therefore, a modification of it is proposed. An independent solution cannot be built on machine learning algorithms alone, owing to the small volume of training datasets relevant to the NEL task. However, their use can contribute to improving the quality of the algorithm. An adaptation of the Latent Dirichlet Allocation model is proposed in order to obtain a measure of the compatibility of attributes of various entities encountered in one context. The efficiency of the proposed algorithm was tested experimentally. A test dataset was independently generated. On its basis, the performance of the model using the proposed algorithm was compared with the open source product DBpedia Spotlight, which solves the NEL problem. The mockup based on the proposed algorithm was slower than DBpedia Spotlight. However, the fact that it showed higher accuracy indicates that work in this direction is promising. The main directions of development were proposed in order to increase the accuracy of the system and its productivity.
Congested Link Inference Algorithms in Dynamic Routing IP Network

Directory of Open Access Journals (Sweden)

Yu Chen

2017-01-01

The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades markedly in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of the dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP) and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Considering that multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validated our results through analogy experiments, simulation, and the actual Internet.
Link Prediction in Evolving Networks Based on Popularity of Nodes.

Science.gov (United States)

Wang, Tong; He, Xing-Sheng; Zhou, Ming-Yang; Fu, Zhong-Qian

2017-08-02

Link prediction aims to uncover the underlying relationships behind networks, which can be used to predict missing edges or identify spurious edges. The key issue in link prediction is to estimate the likelihood of potential links in networks. Most classical static-structure-based methods ignore the temporal aspects of networks; limited by time-varying features, such approaches perform poorly in evolving networks. In this paper, we propose the hypothesis that the ability of each node to attract links depends not only on its structural importance, but also on its current popularity (activeness), since active nodes are much more likely to attract future links. Then a novel approach named the popularity based structural perturbation method (PBSPM) and its fast algorithm are proposed to characterize the likelihood of an edge from both the existing connectivity structure and the current popularity of its two endpoints. Experiments on six evolving networks show that the proposed methods outperform state-of-the-art methods in accuracy and robustness. In addition, visual results and statistical analysis reveal that the proposed methods are inclined to predict future edges between active nodes, rather than edges between inactive nodes.
Linking mothers and infants within electronic health records: a comparison of deterministic and probabilistic algorithms.

Science.gov (United States)

Baldwin, Eric; Johnson, Karin; Berthoud, Heidi; Dublin, Sascha

2015-01-01

To compare probabilistic and deterministic algorithms for linking mothers and infants within electronic health records (EHRs) to support pregnancy outcomes research. The study population was women enrolled in Group Health (Washington State, USA) delivering a liveborn infant from 2001 through 2008 (N = 33,093 deliveries) and infant members born in these years. We linked women to infants by surname, address, and dates of birth and delivery using deterministic and probabilistic algorithms. In a subset previously linked using "gold standard" identifiers (N = 14,449), we assessed each approach's sensitivity and positive predictive value (PPV). For deliveries with no "gold standard" linkage (N = 18,644), we compared the algorithms' linkage proportions. We repeated our analyses in an independent test set of deliveries from 2009 through 2013. We reviewed medical records to validate a sample of pairs apparently linked by one algorithm but not the other (N = 51 or 1.4% of discordant pairs). In the 2001-2008 "gold standard" population, the probabilistic algorithm's sensitivity was 84.1% (95% CI, 83.5-84.7) and PPV 99.3% (99.1-99.4), while the deterministic algorithm had sensitivity 74.5% (73.8-75.2) and PPV 95.7% (95.4-96.0). In the test set, the probabilistic algorithm again had higher sensitivity and PPV. For deliveries in 2001-2008 with no "gold standard" linkage, the probabilistic algorithm found matched infants for 58.3% and the deterministic algorithm, 52.8%. On medical record review, 100% of linked pairs appeared valid. A probabilistic algorithm improved linkage proportion and accuracy compared to a deterministic algorithm. Better linkage methods can increase the value of EHRs for pregnancy outcomes research. Copyright © 2014 John Wiley & Sons, Ltd.
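The two accuracy measures reported above follow directly from counts of true and false links. The sketch below shows the standard definitions of sensitivity and positive predictive value; the counts used are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Share of true mother-infant pairs that the algorithm linked."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Share of pairs linked by the algorithm that are true pairs."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (not the study's data):
# 80 true pairs linked, 20 true pairs missed, 5 incorrect pairs linked.
tp, fn, fp = 80, 20, 5
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
print(f"sensitivity={sens:.3f}, PPV={ppv:.3f}")
```

Sensitivity falls when true pairs are missed, while PPV falls when wrong pairs are linked, which is why the abstract reports both for each algorithm.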
An auxiliary optimization method for complex public transit route network based on link prediction

Science.gov (United States)

Zhang, Lin; Lu, Jian; Yue, Xianfei; Zhou, Jialin; Li, Yunxuan; Wan, Qian

2018-02-01

Inspired by the missing (new) link prediction and the spurious existing link identification in link prediction theory, this paper establishes an auxiliary optimization method for public transit route network (PTRN) based on link prediction. First, link prediction applied to PTRN is described, and based on reviewing the previous studies, the summary indices set and its algorithms set are collected for the link prediction experiment. Second, through analyzing the topological properties of Jinan’s PTRN established by the Space R method, we found that this is a typical small-world network with a relatively large average clustering coefficient. This phenomenon indicates that the structural similarity-based link prediction will show a good performance in this network. Then, based on the link prediction experiment of the summary indices set, three indices with maximum accuracy are selected for auxiliary optimization of Jinan’s PTRN. Furthermore, these link prediction results show that the overall layout of Jinan’s PTRN is stable and orderly, except for a partial area that requires optimization and reconstruction. The above pattern conforms to the general pattern of the optimal development stage of PTRN in China. Finally, based on the missing (new) link prediction and the spurious existing link identification, we propose optimization schemes that can be used not only to optimize current PTRN but also to evaluate PTRN planning.
Efficient network disintegration under incomplete information: the comic effect of link prediction

Science.gov (United States)

Tan, Suo-Yi; Wu, Jun; Lü, Linyuan; Li, Meng-Jun; Lu, Xin

2016-01-01

The study of network disintegration has attracted much attention due to its wide applications, including suppressing epidemic spreading, destabilizing terrorist networks, preventing financial contagion, controlling rumor diffusion and perturbing cancer networks. The crux of the matter is to find the critical nodes whose removal will lead to network collapse. This paper studies the disintegration of networks with incomplete link information. An effective method is proposed to find the critical nodes with the assistance of link prediction techniques. Extensive experiments on both synthetic and real networks suggest that, by using a link prediction method to recover some of the missing links in advance, the method can largely improve the network disintegration performance. Moreover, to our surprise, we find that when the amount of missing information is relatively small, our method even outperforms the results based on complete information. We refer to this phenomenon as the “comic effect” of link prediction, which means that the network is reshaped through the addition of some links identified by link prediction algorithms, and the reshaped network is like an exaggerated but characteristic comic of the original one, where the important parts are emphasized. PMID:26960247
A link prediction method for heterogeneous networks based on BP neural network

Science.gov (United States)

Li, Ji-chao; Zhao, Dan-ling; Ge, Bing-Feng; Yang, Ke-Wei; Chen, Ying-Wu

2018-04-01

Most real-world systems, composed of different types of objects connected via many interconnections, can be abstracted as various complex heterogeneous networks. Link prediction for heterogeneous networks is of great significance for mining missing links and reconfiguring networks according to observed information, with considerable applications in, for example, friend and location recommendations and disease-gene candidate detection. In this paper, we put forward a novel integrated framework, called MPBP (Meta-Path feature-based BP neural network model), to predict multiple types of links for heterogeneous networks. More specifically, the concept of meta-path is introduced, followed by the extraction of meta-path features for heterogeneous networks. Next, based on the extracted meta-path features, a supervised link prediction model is built with a three-layer BP neural network. Then, the solution algorithm of the proposed link prediction model is put forward to obtain predicted results by iteratively training the network. Last, numerical experiments on example datasets of a gene-disease network and a combat network are conducted to verify the effectiveness and feasibility of the proposed MPBP. The results show that MPBP performs very well and is superior to the baseline methods.
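One common kind of meta-path feature is the number of path instances between two nodes whose node types follow a given type sequence. The sketch below counts such instances on a tiny gene-disease network. It is an assumption that a count of this kind resembles the features used by MPBP, since the abstract does not define them; the network, type labels, and function name are hypothetical.

```python
def count_meta_path(adj, node_type, source, target, meta_path):
    """Count walks from source to target whose node types follow meta_path.

    Nodes may repeat along a walk; this is a simple path-count feature.
    """
    if node_type[source] != meta_path[0]:
        return 0
    frontier = {source: 1}                 # node -> number of partial walks ending there
    for wanted in meta_path[1:]:
        nxt = {}
        for node, count in frontier.items():
            for nb in adj.get(node, ()):
                if node_type[nb] == wanted:
                    nxt[nb] = nxt.get(nb, 0) + count
        frontier = nxt
    return frontier.get(target, 0)


# Hypothetical heterogeneous network with two node types.
node_type = {"g1": "gene", "g2": "gene", "d1": "disease", "d2": "disease"}
adj = {
    "g1": ["d1"],
    "g2": ["d1", "d2"],
    "d1": ["g1", "g2"],
    "d2": ["g2"],
}
path = ["gene", "disease", "gene", "disease"]
feature = count_meta_path(adj, node_type, "g1", "d2", path)
print(feature)
```

In a supervised setting, one such count per meta-path would form the feature vector of a candidate link, which a classifier such as a neural network could then score.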
Link adaptation algorithm for distributed coded transmissions in cooperative OFDMA systems

DEFF Research Database (Denmark)

Varga, Mihaly; Badiu, Mihai Alin; Bota, Vasile

2015-01-01

This paper proposes a link adaptation algorithm for cooperative transmissions in the down-link connection of an OFDMA-based wireless system. The algorithm aims at maximizing the spectral efficiency of a relay-aided communication link, while satisfying the block error rate constraints at both...... adaptation algorithm has linear complexity with the number of available resource blocks, while still providing very good performance, as shown by simulation results....
An improved shuffled frog leaping algorithm based evolutionary framework for currency exchange rate prediction

Science.gov (United States)

Dash, Rajashree

2017-11-01

Forecasting the purchasing power of one currency with respect to another is always an interesting topic in the field of financial time series prediction. Despite the existence of several traditional and computational models for currency exchange rate forecasting, there is always a need for simpler and more efficient models with better prediction capability. In this paper, an evolutionary framework is proposed that uses an improved shuffled frog leaping (ISFL) algorithm with a computationally efficient functional link artificial neural network (CEFLANN) for prediction of currency exchange rates. The model is validated by observing the monthly prediction measures obtained for three currency exchange data sets, namely USD/CAD, USD/CHF, and USD/JPY, accumulated within the same period of time. The model performance is also compared with two other evolutionary learning techniques, the shuffled frog leaping algorithm and the particle swarm optimization algorithm. Practical analysis of the results suggests that the proposed model, developed using the ISFL algorithm with the CEFLANN network, is a promising predictor for currency exchange rates compared to the other models included in the study.

An algorithm for link restoration of wavelength routing optical networks

DEFF Research Database (Denmark)

Limal, Emmanuel; Stubkjær, Kristian

1999-01-01

We present an algorithm for restoration of single link failure in wavelength routing multihop optical networks. The algorithm is based on an innovative study of networks using graph theory. It has the following original features: it (i) assigns working and spare channels simultaneously, (ii......) prevents the search for unacceptable routing paths by pointing out channels required for restoration, (iii) offers a high utilization of the capacity resources and (iv) allows a trivial search for the restoration paths. The algorithm is for link restoration of networks without wavelength translation. Its...
Optimal design of link systems using successive zooming genetic algorithm

Science.gov (United States)

Kwon, Young-Doo; Sohn, Chang-hyun; Kwon, Soon-Bum; Lim, Jae-gyoo

2009-07-01

Link systems have been around for a long time and are still used to control motion in diverse applications such as automobiles, robots and industrial machinery. This study presents a procedure involving the use of a genetic algorithm for the optimal design of single four-bar link systems and a double four-bar link system used in a diesel engine. We adopted the Successive Zooming Genetic Algorithm (SZGA), which has one of the most rapid convergence rates among global search algorithms. The results are verified by experiment and the Recurdyn dynamic motion analysis package. During the optimal design of single four-bar link systems, we found that, in the case of identical input/output (IO) angles, the initial and final configurations show a certain symmetry. For the double link system, we introduced weighting factors for the multi-objective functions, which minimize the difference between output angles, providing balanced engine performance, as well as the difference between the final output angle and its desired magnitude. We adopted a graphical method to select a proper ratio between the weighting factors.
Link Label Prediction in Signed Citation Network

KAUST Repository

Akujuobi, Uchenna

2016-04-12

Link label prediction is the problem of predicting the missing labels or signs of all the unlabeled edges in a network. For signed networks, these labels can be either positive or negative. In recent years, different algorithms have been proposed, using techniques such as regression, trust propagation and matrix factorization. These approaches have tried to solve the problem of link label prediction by using ideas from social theories, and most of them predict a single missing label given that the labels of the other edges are known. However, in most real-world social graphs, the number of labeled edges is usually smaller than that of unlabeled edges. Therefore, predicting a single edge label at a time would require multiple runs and is more computationally demanding. In this thesis, we look at the link label prediction problem on a signed citation network with missing edge labels. Our citation network consists of papers from three major machine learning and data mining conferences together with their references, and edges showing the relationship between them. An edge in our network is labeled positive (dataset relevant) if the reference is based on the dataset used in the paper, or negative otherwise. We present three approaches to predict the missing labels. The first approach converts the label prediction problem into a standard classification problem. We then generate a set of features for each edge and adopt Support Vector Machines to solve the classification problem. For the second approach, we formalize the graph such that the edges are represented as nodes, with links showing similarities between them. We then adopt a label propagation method to propagate the labels on known nodes to those with unknown labels. In the third approach, we adopt a PageRank approach where we rank the nodes according to the number of incoming positive and negative edges, after which we set a threshold. Based on the ranks, we can infer an edge would be positive if it goes a node above the
The integration of weighted human gene association networks based on link prediction.

Science.gov (United States)

Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing

2017-01-31

Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, these existing human gene association networks share only a limited number of overlapping interactions, suggesting that they are incomplete and noisy. Here we proposed a workflow to construct a weighted human gene association network using information from six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied a link prediction algorithm to predict possible missing links of the networks, used a cross-validation approach to refine each network, and finally integrated the refined networks to obtain the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network has many more links than most of the original networks, while its links still keep high functional relevance. Used as the background network in a case study of disease gene prediction, the final integrated network performs well, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.
Common neighbours and the local-community-paradigm for topological link prediction in bipartite networks

International Nuclear Information System (INIS)

Daminelli, Simone; Thomas, Josephine Maria; Durán, Claudio; Vittorio Cannistraci, Carlo

2015-01-01

Bipartite networks are powerful descriptions of complex systems characterized by two different classes of nodes and connections allowed only across but not within the two classes. Unveiling physical principles, building theories and suggesting physical models to predict bipartite links such as product-consumer connections in recommendation systems or drug–target interactions in molecular networks can provide priceless information to improve e-commerce or to accelerate pharmaceutical research. The prediction of nonobserved connections starting from those already present in the topology of a network is known as the link-prediction problem. It represents an important subject both in many-body interaction theory in physics and in new algorithms for applied tools in computer science. The rationale is that the existing connectivity structure of a network can suggest where new connections can appear with higher likelihood in an evolving network, or where nonobserved connections are missing in a partially known network. Surprisingly, current complex network theory presents a theoretical bottle-neck: a general framework for local-based link prediction directly in the bipartite domain is missing. Here, we overcome this theoretical obstacle and present a formal definition of common neighbour index and local-community-paradigm (LCP) for bipartite networks. As a consequence, we are able to introduce the first node-neighbourhood-based and LCP-based models for topological link prediction that utilize the bipartite domain. We performed link prediction evaluations in several networks of different size and of disparate origin, including technological, social and biological systems. Our models significantly improve topological prediction in many bipartite networks because they exploit local physical driving-forces that participate in the formation and organization of many real-world bipartite networks. Furthermore, we present a local-based formalism that allows to intuitively
Disrupting the Dissertation: Linked Data, Enhanced Publication and Algorithmic Culture

Science.gov (United States)

Tracy, Frances; Carmichael, Patrick

2017-01-01

This article explores how the three aspects of Striphas' notion of algorithmic culture (information, crowds and algorithms) might influence and potentially disrupt established educational practices. We draw on our experience of introducing semantic web and linked data technologies into higher education settings, focussing on extended student…
The alliance relationship analysis of international terrorist organizations with link prediction

Science.gov (United States)

Fang, Ling; Fang, Haiyang; Tian, Yanfang; Yang, Tinghong; Zhao, Jing

2017-09-01

Terrorism is a huge public hazard to the international community. Alliances of terrorist organizations may pose a more serious threat to national security and world peace. Understanding alliances between global terrorist organizations will facilitate more effective anti-terrorism collaboration between governments. Based on publicly available data, this study constructed an alliance network between terrorist organizations and analyzed the alliance relationships with link prediction. We proposed a novel index based on optimal weighted fusion of six similarity indices, in which the optimal weight is calculated by a genetic algorithm. Our experimental results showed that this algorithm could achieve better results on the networks than other algorithms. Using this method, we successfully dug out 21 real terrorist organization alliances from current data. Our experiment shows that this approach to terrorist organization alliance mining is effective, and this study is expected to benefit the formation of a more powerful anti-terrorism strategy.
An algorithm for the diagnosis of X-linked intellectual disability in children

Directory of Open Access Journals (Sweden)

V. Yu. Voinova

2016-01-01

Full Text Available X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous group of hereditary diseases caused by mutations on the X chromosome, which lead to impaired intellectual development. The paper determines for the first time the proportion of X-linked diseases (6.54%) in the pattern of intellectual disability in children. A system has been developed to quantify the clinical severity of fragile X mental retardation syndrome and Rett syndrome. A system has been scientifically justified to predict the clinical severity, which is based on an analysis of the impact of genetic and epigenetic factors (mutation type and location, X chromosome inactivation). The authors have determined the contribution of nonrandom X inactivation to the clinical polymorphism of various forms of XLID and established its role as an important diagnostic marker for pathology. It is shown that the study of X chromosome inactivation can identify asymptomatic female carriers of X-linked mutations to provide medical genetic counseling to families. An algorithm has been elaborated to diagnose XLID among the undifferentiated forms of mental developmental abnormalities in children.
Algorithms for Protein Structure Prediction

DEFF Research Database (Denmark)

Paluszewski, Martin

-trace. Here we present three different approaches for reconstruction of C-traces from predictable measures. In our first approach [63, 62], the C-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE......) is more robust than standard Monte Carlo search. In the second approach for reconstruction of C-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower...... bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm...
Community detection, link prediction, and layer interdependence in multilayer networks

Science.gov (United States)

De Bacco, Caterina; Power, Eleanor A.; Larremore, Daniel B.; Moore, Cristopher

2017-04-01

Complex systems are often characterized by distinct types of interactions between the same entities. These can be described as a multilayer network where each layer represents one type of interaction. These layers may be interdependent in complicated ways, revealing different kinds of structure in the network. In this work we present a generative model, and an efficient expectation-maximization algorithm, which allows us to perform inference tasks such as community detection and link prediction in this setting. Our model assumes overlapping communities that are common between the layers, while allowing these communities to affect each layer in a different way, including arbitrary mixtures of assortative, disassortative, or directed structure. It also gives us a mathematically principled way to define the interdependence between layers, by measuring how much information about one layer helps us predict links in another layer. In particular, this allows us to bundle layers together to compress redundant information and identify small groups of layers which suffice to predict the remaining layers accurately. We illustrate these findings by analyzing synthetic data and two real multilayer networks, one representing social support relationships among villagers in South India and the other representing shared genetic substring material between genes of the malaria parasite.
Improving local clustering based top-L link prediction methods via asymmetric link clustering information

Science.gov (United States)

Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan

2018-02-01

Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis, and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. Clustering information plays an important role in solving the link prediction problem. In the previous literature, the node clustering coefficient appears frequently in link prediction methods. However, the node clustering coefficient is limited in describing the role of a common neighbor in different local networks, because it cannot distinguish the different clustering abilities of a node with respect to different node pairs. In this paper, we shift our focus from nodes to links, and propose the concept of the asymmetric link clustering (ALC) coefficient. Further, we improve three node clustering based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node clustering based methods, achieving especially remarkable improvements on food web, hamster friendship and Internet networks. Besides, compared with other methods, the performance of ALC-based methods is very stable in both globalized and personalized top-L link prediction tasks.
Evaluating and comparing algorithms for respiratory motion prediction

International Nuclear Information System (INIS)

Ernst, F; Dürichen, R; Schlaefer, A; Schweikard, A

2013-01-01

In robotic radiosurgery, it is necessary to compensate for systematic latencies arising from target tracking and mechanical constraints. This compensation is usually achieved by means of an algorithm which computes the future target position. In most scientific works on respiratory motion prediction, only one or two algorithms are evaluated on a limited amount of very short motion traces. The purpose of this work is to gain more insight into the real world capabilities of respiratory motion prediction methods by evaluating many algorithms on an unprecedented amount of data. We have evaluated six algorithms, the normalized least mean squares (nLMS), recursive least squares (RLS), multi-step linear methods (MULIN), wavelet-based multiscale autoregression (wLMS), extended Kalman filtering, and ε-support vector regression (SVRpred) methods, on an extensive database of 304 respiratory motion traces. The traces were collected during treatment with the CyberKnife (Accuray, Inc., Sunnyvale, CA, USA) and feature an average length of 71 min. Evaluation was done using a graphical prediction toolkit, which is available to the general public, as is the data we used. The experiments show that the nLMS algorithm, which is one of the algorithms currently used in the CyberKnife, is outperformed by all other methods. This is especially true in the case of the wLMS, the SVRpred, and the MULIN algorithms, which perform much better. The nLMS algorithm produces a relative root mean square (RMS) error of 75% or less (i.e., a reduction in error of 25% or more when compared to not doing prediction) in only 38% of the test cases, whereas the MULIN and SVRpred methods reach this level in more than 77%, the wLMS algorithm in more than 84% of the test cases. Our work shows that the wLMS algorithm is the most accurate algorithm and does not require parameter tuning, making it an ideal candidate for clinical implementation. Additionally, we have seen that the structure of a patient's ...
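The relative RMS error quoted in this abstract can be illustrated with a short sketch. This is not the authors' toolkit: the function names `rms` and `relative_rms` are hypothetical, and the sketch assumes that "not doing prediction" means using the sample observed `horizon` steps earlier as the estimate of the current position, and that `predicted[i]` is the prediction made for time step `i`.

```python
import math

def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def relative_rms(actual, predicted, horizon):
    """RMS error with prediction divided by RMS error without prediction.

    Assumption: without prediction, the delayed sample actual[i - horizon]
    stands in for actual[i]. `horizon` must be at least 1.
    """
    target = actual[horizon:]        # true positions being estimated
    delayed = actual[:-horizon]      # what is known without prediction
    pred = predicted[horizon:]       # predictions aligned with `target`
    err_pred = [p - t for p, t in zip(pred, target)]
    err_none = [d - t for d, t in zip(delayed, target)]
    return rms(err_pred) / rms(err_none)

# Hypothetical toy trace, chosen only to show the arithmetic.
actual = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [0.0, 1.0, 2.5, 3.5, 4.5, 5.5]
ratio = relative_rms(actual, predicted, horizon=1)
print(f"relative RMS = {ratio:.3f}")
```

A ratio below 0.75 corresponds to the "75% or less" level mentioned in the abstract; a ratio of 1.0 would mean prediction brings no improvement over the delayed signal.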
A new mutually reinforcing network node and link ranking algorithm.

Science.gov (United States)

Wang, Zhenghua; Dueñas-Osorio, Leonardo; Padgett, Jamie E

2015-10-23

This study proposes a novel Normalized Wide network Ranking algorithm (NWRank) that has the advantage of ranking nodes and links of a network simultaneously. This algorithm combines the mutual reinforcement feature of Hypertext Induced Topic Selection (HITS) and the weight normalization feature of PageRank. Relative weights are assigned to links based on the degree of the adjacent neighbors and the Betweenness Centrality instead of assigning the same weight to every link as assumed in PageRank. Numerical experiment results show that NWRank performs consistently better than HITS, PageRank, eigenvector centrality, and edge betweenness from the perspective of network connectivity and approximate network flow, which is also supported by comparisons with the expensive N-1 benchmark removal criteria based on network efficiency. Furthermore, it can avoid some problems, such as the Tightly Knit Community effect, which exists in HITS. NWRank provides a new inexpensive way to rank nodes and links of a network, which has practical applications, particularly to prioritize resource allocation for upgrade of hierarchical and distributed networks, as well as to support decision making in the design of networks, where node and link importance depend on a balance of local and global integrity.
A new mutually reinforcing network node and link ranking algorithm

Science.gov (United States)

Wang, Zhenghua; Dueñas-Osorio, Leonardo; Padgett, Jamie E.

2015-01-01

This study proposes a novel Normalized Wide network Ranking algorithm (NWRank) that has the advantage of ranking nodes and links of a network simultaneously. This algorithm combines the mutual reinforcement feature of Hypertext Induced Topic Selection (HITS) and the weight normalization feature of PageRank. Relative weights are assigned to links based on the degree of the adjacent neighbors and the Betweenness Centrality instead of assigning the same weight to every link as assumed in PageRank. Numerical experiment results show that NWRank performs consistently better than HITS, PageRank, eigenvector centrality, and edge betweenness from the perspective of network connectivity and approximate network flow, which is also supported by comparisons with the expensive N-1 benchmark removal criteria based on network efficiency. Furthermore, it can avoid some problems, such as the Tightly Knit Community effect, which exists in HITS. NWRank provides a new inexpensive way to rank nodes and links of a network, which has practical applications, particularly to prioritize resource allocation for upgrade of hierarchical and distributed networks, as well as to support decision making in the design of networks, where node and link importance depend on a balance of local and global integrity. PMID:26492958
Link Prediction in Social Networks: the State-of-the-Art

OpenAIRE

Wang, Peng; Xu, Baowen; Wu, Yurong; Zhou, Xiaoyu

2014-01-01

In social networks, link prediction, which predicts missing links in current networks and new or dissolving links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, much work has been done on link prediction in social networks. The goal of this paper is to comprehensively review, analyze and discuss the state of the art of link prediction in social networks. A systematical category for link prediction techniques and problems ...
Computationally efficient model predictive control algorithms a neural network approach

CERN Document Server

Ławryńczuk, Maciej

2014-01-01

This book thoroughly discusses computationally efficient (suboptimal) Model Predictive Control (MPC) techniques based on neural models. The subjects treated include: · A few types of suboptimal MPC algorithms in which a linear approximation of the model or of the predicted trajectory is successively calculated on-line and used for prediction. · Implementation details of the MPC algorithms for feedforward perceptron neural models, neural Hammerstein models, neural Wiener models and state-space neural models. · The MPC algorithms based on neural multi-models (inspired by the idea of predictive control). · The MPC algorithms with neural approximation with no on-line linearization. · The MPC algorithms with guaranteed stability and robustness. · Cooperation between the MPC algorithms and set-point optimization. Thanks to linearization (or neural approximation), the presented suboptimal algorithms do not require d...
Optimisation of the link volume for weakest link failure prediction in NBG-18 nuclear graphite

International Nuclear Information System (INIS)

Hindley, Michael P.; Groenwold, Albert A.; Blaine, Deborah C.; Becker, Thorsten H.

2014-01-01

This paper describes the process for approximating the optimal size of a link volume required for weakest link failure calculation in nuclear graphite, with NBG-18 used as an example. As part of the failure methodology, the link volume is defined in terms of two grouping criteria. The first criterion is a factor of the maximum grain size and the second criterion is a function of an equivalent stress limit. A methodology for approximating these grouping criteria is presented. The failure methodology employs finite element analysis (FEA) in order to predict the failure load, at 50% probability of failure. The average experimental failure load, as determined for 26 test geometries, is used to evaluate the accuracy of the weakest link failure calculations. The influence of the two grouping criteria on the failure load prediction is evaluated by defining an error in prediction across all test cases. Mathematical optimisation is used to find the minimum error across a range of test case failure predictions. This minimum error is shown to deliver the most accurate failure prediction across a whole range of components, although some test cases in the range predict conservative failure load. The mathematical optimisation objective function is penalised to account for non-conservative prediction of the failure load for any test case. The optimisation is repeated and a link volume found for conservative failure prediction. The failure prediction for each test case is evaluated, in detail, for the proposed link volumes. Based on the analysis, link design volumes for NBG-18 are recommended for either accurate or conservative failure prediction
Inverse kinematics algorithm for a six-link manipulator using a polynomial expression

International Nuclear Information System (INIS)

Sasaki, Shinobu

1987-01-01

This report is concerned with the forward and inverse kinematics problem relevant to a six-link robot manipulator. In order to derive the kinematic relationships between links, the vector rotation operator was applied instead of the conventional homogeneous transformation. The exact algorithm for solving the inverse problem was obtained by transforming the kinematics equations into a polynomial. As shown in test calculations, the accuracies of numerical solutions obtained by means of the present approach are found to be quite high. The proposed algorithm makes it possible to find all feasible solutions for the given inverse problem. (author)

Application of a fast sorting algorithm to the assignment of mass spectrometric cross-linking data.

Science.gov (United States)

Petrotchenko, Evgeniy V; Borchers, Christoph H

2014-09-01

Cross-linking combined with MS involves enzymatic digestion of cross-linked proteins and identifying cross-linked peptides. Assignment of cross-linked peptide masses requires a search of all possible binary combinations of peptides from the cross-linked proteins' sequences, which becomes impractical with increasing complexity of the protein system and/or if digestion enzyme specificity is relaxed. Here, we describe the application of a fast sorting algorithm to search large sequence databases for cross-linked peptide assignments based on mass. This same algorithm has been used previously for assigning disulfide-bridged peptides (Choi et al., ), but has not previously been applied to cross-linking studies. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Predicting Hidden Links in Supply Networks

Directory of Open Access Journals (Sweden)

A. Brintrup

2018-01-01

Full Text Available Manufacturing companies often lack visibility of the procurement interdependencies between the suppliers within their supply network. However, knowledge of these interdependencies is useful to plan for potential operational disruptions. In this paper, we develop the Supply Network Link Predictor (SNLP) method to infer supplier interdependencies using the manufacturer's incomplete knowledge of the network. SNLP uses topological data to extract relational features from the known network to train a classifier for predicting potential links. Using a test case from the automotive industry, four features are extracted: (i) number of existing supplier links, (ii) overlaps between supplier product portfolios, (iii) product outsourcing associations, and (iv) likelihood of buyers purchasing from two suppliers together. Naïve Bayes and Logistic Regression are then employed to predict whether these features can help predict interdependencies between two suppliers. Our results show that these features can indeed be used to predict interdependencies in the network and that predictive accuracy is maximised by (i) and (iii). The findings give rise to the exciting possibility of using data analytics for improving supply chain visibility. We then proceed to discuss to what extent such approaches can be adopted and their limitations, highlighting next steps for future work in this area.
Evaluating ortholog prediction algorithms in a yeast model clade.

Directory of Open Access Journals (Sweden)

Leonidas Salichos

Full Text Available BACKGROUND: Accurate identification of orthologs is crucial for evolutionary studies and for functional annotation. Several algorithms have been developed for ortholog delineation, but so far, manually curated genome-scale biological databases of orthologous genes for algorithm evaluation have been lacking. We evaluated four popular ortholog prediction algorithms (MultiParanoid; OrthoMCL; RBH: Reciprocal Best Hit; RSD: Reciprocal Smallest Distance; the last two extended into clustering algorithms cRBH and cRSD, respectively, so that they can predict orthologs across multiple taxa) against a set of 2,723 groups of high-quality curated orthologs from 6 Saccharomycete yeasts in the Yeast Gene Order Browser. RESULTS: Examination of sensitivity [TP/(TP+FN)], specificity [TN/(TN+FP)], and accuracy [(TP+TN)/(TP+TN+FP+FN)] across a broad parameter range showed that cRBH was the most accurate and specific algorithm, whereas OrthoMCL was the most sensitive. Evaluation of the algorithms across a varying number of species showed that cRBH had the highest accuracy and lowest false discovery rate [FP/(FP+TP)], followed by cRSD. Of the six species in our set, three descended from an ancestor that underwent whole genome duplication. Subsequent differential duplicate loss events in the three descendants resulted in distinct classes of gene loss patterns, including cases where the genes retained in the three descendants are paralogs, constituting 'traps' for ortholog prediction algorithms. We found that the false discovery rate of all algorithms dramatically increased in these traps. CONCLUSIONS: These results suggest that simple algorithms, like cRBH, may be better ortholog predictors than more complex ones (e.g., OrthoMCL and MultiParanoid) for evolutionary and functional genomics studies where the objective is the accurate inference of single-copy orthologs (e.g., molecular phylogenetics), but that all algorithms fail to accurately predict orthologs when paralogy
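The four evaluation measures defined in brackets in this abstract translate directly into code. The following is a minimal sketch of that arithmetic, not the authors' evaluation pipeline; the function name `confusion_metrics` and the example counts are hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute the four measures from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),               # TP/(TP+FN)
        "specificity": tn / (tn + fp),               # TN/(TN+FP)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # (TP+TN)/(TP+TN+FP+FN)
        "fdr": fp / (fp + tp),                       # FP/(FP+TP)
    }

# Hypothetical counts, chosen only to illustrate the formulas.
metrics = confusion_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and false discovery rate ignore true negatives entirely, which is one reason an algorithm can be the most sensitive without being the most accurate.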
Novel prediction- and subblock-based algorithm for fractal image compression

International Nuclear Information System (INIS)

Chung, K.-L.; Hsu, C.-H.

2006-01-01

Fractal encoding is the most time-consuming part of fractal image compression. In this paper, a novel two-phase prediction- and subblock-based fractal encoding algorithm is presented. Initially the original gray image is partitioned into a set of variable-size blocks according to the S-tree- and interpolation-based decomposition principle. In the first phase, each current variable-size range block tries to find the best matched domain block based on the proposed prediction-based search strategy, which utilizes the relevant neighboring variable-size domain blocks. The first phase leads to a significant computation-saving effect. If the domain block found within the predicted search space is unacceptable, in the second phase, a subblock strategy is employed to partition the current variable-size range block into smaller blocks to improve the image quality. Experimental results show that our proposed prediction- and subblock-based fractal encoding algorithm outperforms the conventional full search algorithm and the recently published spatial-correlation-based algorithm by Truong et al. in terms of encoding time and image quality. In addition, the performance comparison among our proposed algorithm and the other two algorithms, the no search-based algorithm and the quadtree-based algorithm, is also investigated.
Enhanced clinical pharmacy service targeting tools: risk-predictive algorithms.

Science.gov (United States)

El Hajji, Feras W D; Scullin, Claire; Scott, Michael G; McElnay, James C

2015-04-01

This study aimed to determine the value of using a mix of clinical pharmacy data and routine hospital admission spell data in the development of predictive algorithms. Exploration of risk factors in hospitalized patients, together with the targeting strategies devised, will enable the prioritization of clinical pharmacy services to optimize patient outcomes. Predictive algorithms were developed using a number of detailed steps using a 75% sample of integrated medicines management (IMM) patients, and validated using the remaining 25%. IMM patients receive targeted clinical pharmacy input throughout their hospital stay. The algorithms were applied to the validation sample, and predicted risk probability was generated for each patient from the coefficients. Risk thresholds for the algorithms were determined by identifying the cut-off points of risk scores at which the algorithm would have the highest discriminative performance. Clinical pharmacy staffing levels were obtained from the pharmacy department staffing database. Numbers of previous emergency admissions and admission medicines together with age-adjusted co-morbidity and diuretic receipt formed a 12-month post-discharge and/or readmission risk algorithm. Age-adjusted co-morbidity proved to be the best index to predict mortality. Increased numbers of clinical pharmacy staff at ward level were correlated with a reduction in risk-adjusted mortality index (RAMI). Algorithms created were valid in predicting risk of in-hospital and post-discharge mortality and risk of hospital readmission 3, 6 and 12 months post-discharge. The provision of ward-based clinical pharmacy services is a key component to reducing RAMI and enabling the full benefits of pharmacy input to patient care to be realized. © 2014 John Wiley & Sons, Ltd.
A comprehensive comparison of network similarities for link prediction and spurious link elimination

Science.gov (United States)

Zhang, Peng; Qiu, Dan; Zeng, An; Xiao, Jinghua

2018-06-01

Identifying missing interactions in complex networks, known as link prediction, is realized by estimating the likelihood of the existence of a link between two nodes according to the observed links and nodes' attributes. Similar approaches have also been employed to identify and remove spurious links in networks, which is crucial for improving the reliability of network data. In network science, the likelihood of two nodes having a connection strongly depends on their structural similarity. The key to addressing these two problems thus becomes how to objectively measure the similarity between nodes in networks. In the literature, numerous network similarity metrics have been proposed and their accuracy has been discussed independently in previous works. In this paper, we systematically compare the accuracy of 18 similarity metrics in both link prediction and spurious link elimination when the observed networks are very sparse or contain inaccurate linking information. Interestingly, although some methods have high prediction accuracy, they tend to have low accuracy in identifying spurious interactions. We further find that methods can be classified into several clusters according to their behaviors. This work is useful for guiding future use of these similarity metrics for different purposes.
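The idea of scoring node pairs by structural similarity can be illustrated with a minimal sketch. The abstract does not list its 18 metrics, so the three indices below (common neighbors, Jaccard, resource allocation) are only assumed to be representative examples; the function names and the toy graph are hypothetical and chosen for illustration.

```python
from itertools import combinations


def neighbors(edges):
    """Build an adjacency map {node: set of neighbors} from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, x, y):
    """Number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


def resource_allocation(adj, x, y):
    """Sum of 1/degree over shared neighbors (low-degree neighbors count more)."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def rank_candidates(edges, score):
    """Rank all currently unlinked node pairs by a similarity score, best first."""
    adj = neighbors(edges)
    existing = {frozenset(e) for e in edges}
    pairs = [p for p in combinations(sorted(adj), 2)
             if frozenset(p) not in existing]
    return sorted(pairs, key=lambda p: score(adj, *p), reverse=True)


# Hypothetical toy network, not taken from the paper.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e")]
toy_adj = neighbors(toy_edges)
best_pair = rank_candidates(toy_edges, common_neighbors)[0]
```

For link prediction the highest-scoring unlinked pairs are proposed as missing links; for spurious link elimination the same scores would be computed for existing links and the lowest-scoring ones flagged.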
Hybrid Monte Carlo algorithm with fat link fermion actions

International Nuclear Information System (INIS)

Kamleh, Waseem; Leinweber, Derek B.; Williams, Anthony G.

2004-01-01

The use of APE smearing or other blocking techniques in lattice fermion actions can provide many advantages. There are many variants of these fat link actions in lattice QCD currently, such as fat link irrelevant clover (FLIC) fermions. The FLIC fermion formalism makes use of the APE blocking technique in combination with a projection of the blocked links back into the special unitary group. This reunitarization is often performed using an iterative maximization of a gauge invariant measure. This technique is not differentiable with respect to the gauge field and thus prevents the use of standard Hybrid Monte Carlo simulation algorithms. The use of an alternative projection technique circumvents this difficulty and allows the simulation of dynamical fat link fermions with standard HMC and its variants. The necessary equations of motion for FLIC fermions are derived, and some initial simulation results are presented. The technique is more general, however, and is straightforwardly applicable to other smearing techniques or fat link actions.
Prediction models and control algorithms for predictive applications of setback temperature in cooling systems

International Nuclear Information System (INIS)

Moon, Jin Woo; Yoon, Younju; Jeon, Young-Hoon; Kim, Sooyoung

2017-01-01

Highlights: • Initial ANN model was developed for predicting the time to the setback temperature. • Initial model was optimized for producing accurate output. • Optimized model proved its prediction accuracy. • ANN-based algorithms were developed and their performance was tested. • ANN-based algorithms presented superior thermal comfort or energy efficiency. - Abstract: In this study, a temperature control algorithm was developed to apply a setback temperature predictively for the cooling system of a residential building during occupied periods by residents. An artificial neural network (ANN) model was developed to determine the required time for increasing the current indoor temperature to the setback temperature. This study involved three phases: development of the initial ANN-based prediction model, optimization and testing of the initial model, and development and testing of three control algorithms. The development and performance testing of the model and algorithm were conducted using TRNSYS and MATLAB. Through the development and optimization process, the final ANN model employed indoor temperature and the temperature difference between the current and target setback temperature as two input neurons. The optimal number of hidden layers, number of neurons, learning rate, and moment were determined to be 4, 9, 0.6, and 0.9, respectively. The tangent–sigmoid and pure-linear transfer functions were used in the hidden and output neurons, respectively. The ANN model used 100 training data sets with the sliding-window method for data management. The Levenberg-Marquardt training method was employed for model training. The optimized model had a root mean square error of 0.9097 when compared with the simulated results. Employing the ANN model, ANN-based algorithms maintained indoor temperatures better within target ranges. Compared to the conventional algorithm, the ANN-based algorithms reduced the duration of time, in which the indoor temperature
A Traffic Prediction Algorithm for Street Lighting Control Efficiency

Directory of Open Access Journals (Sweden)

POPA Valentin

2013-01-01

Full Text Available This paper presents the development of a traffic prediction algorithm that can be integrated in a street lighting monitoring and control system. The prediction algorithm must enable the reduction of energy costs and improve energy efficiency by decreasing the light intensity depending on the traffic level. The algorithm analyses and processes the information received at the command center based on the traffic level at different moments. The data is collected by means of the Doppler vehicle detection sensors integrated within the system. Two methods are used for the implementation of the algorithm: a neural network and a k-NN (k-Nearest Neighbor) prediction algorithm. For 500 training cycles, the mean square error of the neural network is 9.766, and for 500,000 training cycles the error amounts to 0.877. In the case of the k-NN algorithm, the error increases from 8.24 for k=5 to 12.27 for 50 neighbors. In terms of the root mean square error, the neural network ensures the highest performance level and can be integrated in a street lighting control system.
Application of XGBoost algorithm in hourly PM2.5 concentration prediction

Science.gov (United States)

Pan, Bingyue

2018-02-01

To address the prediction of hourly PM2.5 concentration in China, this paper applies the XGBoost (Extreme Gradient Boosting) algorithm. Air quality monitoring data from Tianjin city were analyzed using the XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentration using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression models using computational results. The results demonstrate that the XGBoost algorithm outperforms the other data mining methods.
Effectiveness of link prediction for face-to-face behavioral networks.

Science.gov (United States)

Tsugawa, Sho; Ohsaki, Hiroyuki

2013-01-01

Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30-0.45 and a recall of 0.10-0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks.
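The precision and recall figures quoted in finding (1) can be made concrete with a small sketch. It assumes that a predictor outputs a list of node pairs and that the links which actually formed later are known; the function name and the toy pairs are hypothetical and are not data from the study.

```python
def precision_recall(predicted, actual):
    """Precision and recall of predicted links against links that actually formed.

    Links are undirected, so ("a", "d") and ("d", "a") are treated as the same link.
    """
    predicted = {frozenset(p) for p in predicted}
    actual = {frozenset(a) for a in actual}
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: three predicted links, two links actually formed.
toy_predicted = [("a", "d"), ("b", "e"), ("c", "e")]
toy_actual = [("d", "a"), ("a", "e")]
toy_precision, toy_recall = precision_recall(toy_predicted, toy_actual)
```

Precision is the fraction of predicted links that actually formed, and recall is the fraction of newly formed links that were predicted.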
Improved hybrid optimization algorithm for 3D protein structure prediction.

Science.gov (United States)

Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

2014-07-01

A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve the search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; finally, the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, the protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but the problem can be attributed to a global optimization problem of multi-extremum and multi-parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcoming of a single algorithm, giving full play to the advantage of each algorithm. The algorithm is verified on the current universal standard sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed new method outperforms single algorithms on the accuracy of calculating the protein sequence energy value, which is proved to be an effective way to predict the structure of proteins.
Which clustering algorithm is better for predicting protein complexes?

Directory of Open Access Journals (Sweden)

Moschopoulos Charalampos N

2011-12-01

Full Text Available Abstract Background Protein-Protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks which they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy, while it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
Adaptive Outlier-tolerant Exponential Smoothing Prediction Algorithms with Applications to Predict the Temperature in Spacecraft

OpenAIRE

Hu Shaolin; Zhang Wei; Li Ye; Fan Shunxi

2011-01-01

The exponential smoothing prediction algorithm is widely used in spaceflight control and in process monitoring as well as in economical prediction. There are two key conundrums which are open: one is about the selective rule of the parameter in the exponential smoothing prediction, and the other is how to improve the bad influence of outliers on prediction. In this paper a new practical outlier-tolerant algorithm is built to select adaptively proper parameter, and the exponential smoothing pr...
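As background for the parameter-selection problem mentioned above, here is a minimal sketch of plain simple exponential smoothing. It is not the outlier-tolerant adaptive algorithm of the paper, whose details are cut off in this record; the function name, the smoothing parameter `alpha`, and the sample readings are assumptions made for illustration.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed levels s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first level is initialised to the first observation, and the last
    level serves as the one-step-ahead forecast.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    levels = [series[0]]
    for x in series[1:]:
        levels.append(alpha * x + (1.0 - alpha) * levels[-1])
    return levels


# Hypothetical temperature readings; the last level is the next-step forecast.
toy_levels = exponential_smoothing([10.0, 12.0, 11.0], alpha=0.5)
toy_forecast = toy_levels[-1]
```

A larger `alpha` tracks recent observations more closely but also follows outliers more strongly, which is why the choice of this parameter matters for prediction quality.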
Comparison of predictive performance of data mining algorithms in predicting body weight in Mengali rams of Pakistan

Directory of Open Access Journals (Sweden)

Senol Celik

Full Text Available ABSTRACT The present study aimed at comparing the predictive performance of some data mining algorithms (CART, CHAID, Exhaustive CHAID, MARS, MLP, and RBF) on biometrical data of Mengali rams. To compare the predictive capability of the algorithms, the biometrical data regarding body (body length, withers height, and heart girth) and testicular (testicular length, scrotal length, and scrotal circumference) measurements of Mengali rams in predicting live body weight were evaluated by most goodness of fit criteria. In addition, age was considered as a continuous independent variable. In this context, the MARS data mining algorithm was used for the first time to predict body weight in two forms, without (MARS_1) and with (MARS_2) interaction terms. The superiority order in the predictive accuracy of the algorithms was found as CART > CHAID ≈ Exhaustive CHAID > MARS_2 > MARS_1 > RBF > MLP. Moreover, all tested algorithms provided a strong predictive accuracy for estimating body weight. However, MARS is the only algorithm that generated a prediction equation for body weight. Therefore, it is hoped that the available results might present a valuable contribution in terms of predicting body weight and describing the relationship between the body weight and body and testicular measurements in revealing breed standards and the conservation of indigenous gene sources for Mengali sheep breeding. Therefore, it will be possible to perform more profitable and productive sheep production. Use of data mining algorithms is useful for revealing the relationship between body weight and testicular traits in describing breed standards of Mengali sheep.
Link prediction boosted psychiatry disorder classification for functional connectivity network

Science.gov (United States)

Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang

2017-02-01

Functional connectivity network (FCN) is an effective tool in psychiatry disorders classification, and represents cross-correlation of the regional blood oxygenation level dependent signal. However, FCN is often incomplete because it suffers from missing and spurious edges. To accurately classify psychiatry disorders and healthy controls with the incomplete FCN, we first `repair' the FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers for improving classification accuracy. Our method was tested on three datasets of psychiatry disorders, including Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show that our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.
Predicting Students’ Performance using Modified ID3 Algorithm

OpenAIRE

Ramanathan L; Saksham Dhanda; Suresh Kumar D

2013-01-01

The ability to predict the performance of students is crucial in our present education system. We can use data mining concepts for this purpose. The ID3 algorithm is one of the best-known algorithms for generating decision trees. But this algorithm has a shortcoming: it is biased towards attributes with many values. So, this research aims to overcome this shortcoming of the algorithm by using gain ratio (instead of information gain) as well as by giving weights to each attribute at every...
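The difference between information gain and gain ratio can be sketched on a toy example. This is an illustration of the two standard split criteria only, not the modified ID3 of the paper: the attribute weighting it mentions is not reproduced because the record is truncated, and the attribute names and data below are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on an attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical student records: a unique ID and an attendance flag.
toy_labels = ["pass", "pass", "fail", "fail"]
toy_student_id = [1, 2, 3, 4]              # many distinct values
toy_attended = ["yes", "yes", "no", "no"]  # two distinct values
```

In this toy data both attributes have the same information gain, so plain ID3 cannot tell the meaningless unique ID apart from the attendance flag; the gain ratio penalises the many-valued ID and prefers attendance.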
Machine learning algorithms for datasets popularity prediction

CERN Document Server

Kancys, Kipras

2016-01-01

This report represents continued study where ML algorithms were used to predict databases popularity. Three topics were covered. First of all, there was a discrepancy between old and new meta-data collection procedures, so a reason for that had to be found. Secondly, different parameters were analysed and dropped to make algorithms perform better. And third, it was decided to move modelling part on Spark.
A recurrence-weighted prediction algorithm for musical analysis

Science.gov (United States)

Colucci, Renato; Leguizamon Cucunuba, Juan Sebastián; Lloyd, Simon

2018-03-01

Forecasting the future behaviour of a system using past data is an important topic. In this article we apply nonlinear time series analysis in the context of music, and present new algorithms for extending a sample of music, while maintaining characteristics similar to the original piece. By using ideas from ergodic theory, we adapt the classical prediction method of Lorenz analogues so as to take into account recurrence times, and demonstrate with examples, how the new algorithm can produce predictions with a high degree of similarity to the original sample.
Gas Emission Prediction Model of Coal Mine Based on CSBP Algorithm

Directory of Open Access Journals (Sweden)

Xiong Yan

2016-01-01

Full Text Available In view of the nonlinear characteristics of gas emission in a coal working face, a prediction method is proposed based on a cuckoo search algorithm optimized BP neural network (CSBP). In the CSBP algorithm, the cuckoo search is adopted to optimize the weight and threshold parameters of the BP network and to obtain the global optimal solutions. Furthermore, the twelve main factors affecting gas emission in the coal working face are taken as the input vector of the CSBP algorithm, the gas emission is taken as the output vector, and then the prediction model of the BP neural network with optimal parameters is established. The results show that the CSBP algorithm has better generalization ability and higher prediction accuracy, and can be utilized effectively in the prediction of coal mine gas emission.

A range-based predictive localization algorithm for WSID networks

Science.gov (United States)

Liu, Yuan; Chen, Junjie; Li, Gang

2017-11-01

Most studies on localization algorithms are conducted on the sensor networks with densely distributed nodes. However, the non-localizable problems are prone to occur in the network with sparsely distributed sensor nodes. To solve this problem, a range-based predictive localization algorithm (RPLA) is proposed in this paper for the wireless sensor networks syncretizing the RFID (WSID) networks. The Gaussian mixture model is established to predict the trajectory of a mobile target. Then, the received signal strength indication is used to reduce the residence area of the target location based on the approximate point-in-triangulation test algorithm. In addition, collaborative localization schemes are introduced to locate the target in the non-localizable situations. Simulation results verify that the RPLA achieves accurate localization for the network with sparsely distributed sensor nodes. The localization accuracy of the RPLA is 48.7% higher than that of the APIT algorithm, 16.8% higher than that of the single Gaussian model-based algorithm and 10.5% higher than that of the Kalman filtering-based algorithm.
Exploiting Information Diffusion Feature for Link Prediction in Sina Weibo.

Science.gov (United States)

Li, Dong; Zhang, Yongchao; Xu, Zhiming; Chu, Dianhui; Li, Sheng

2016-01-28

The rapid development of online social networks (e.g., Twitter and Facebook) has promoted research related to social networks in which link prediction is a key problem. Although numerous attempts have been made for link prediction based on network structure, node attribute and so on, few of the current studies have considered the impact of information diffusion on link creation and prediction. This paper mainly addresses Sina Weibo, which is the largest microblog platform with Chinese characteristics, and proposes the hypothesis that information diffusion influences link creation and verifies the hypothesis based on real data analysis. We also detect an important feature from the information diffusion process, which is used to promote link prediction performance. Finally, the experimental results on Sina Weibo dataset have demonstrated the effectiveness of our methods.
Link prediction via generalized coupled tensor factorisation

DEFF Research Database (Denmark)

Ermiş, Beyza; Evrim, Acar Ataman; Taylan Cemgil, A.

2012-01-01

and higher-order tensors. We propose to use an approach based on probabilistic interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors/matrices with common latent factors using...... different loss functions. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves the link prediction performance and the selection of right loss function and tensor model is crucial for accurately predicting missing links....
CAT-PUMA: CME Arrival Time Prediction Using Machine learning Algorithms

Science.gov (United States)

Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert

2018-04-01

CAT-PUMA (CME Arrival Time Prediction Using Machine learning Algorithms) quickly and accurately predicts the arrival time of Coronal Mass Ejections (CMEs). The software was trained via detailed analysis of CME features and solar wind parameters using 182 previously observed geo-effective partial-/full-halo CMEs and uses Support Vector Machine (SVM) algorithms to make its predictions, which can be made within minutes of providing the necessary input parameters of a CME.
Link Label Prediction in Signed Citation Network

KAUST Repository

Akujuobi, Uchenna Thankgod

2016-01-01

such as using regression, trust propagation and matrix factorization. These approaches have tried to solve the problem of link label prediction by using ideas from social theories, where most of them predict a single missing label given that labels of other
Fast prediction of RNA-RNA interaction using heuristic algorithm.

Science.gov (United States)

Montaseri, Soheila

2015-01-01

Interaction between two RNA molecules plays a crucial role in many medical and biological processes such as gene expression regulation. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. Some algorithms have been formed to predict the structure of the RNA-RNA interaction. High computational time is a common challenge in most of the presented algorithms. In this context, a heuristic method is introduced to accurately predict the interaction between two RNAs based on minimum free energy (MFE). This algorithm uses a few dot matrices for finding the secondary structure of each RNA and binding sites between two RNAs. Furthermore, a parallel version of this method is presented. We describe the algorithm's concurrency and parallelism for a multicore chip. The proposed algorithm has been performed on some datasets including CopA-CopT, R1inv-R2inv, Tar-Tar*, DIS-DIS, and IncRNA54-RepZ in Escherichia coli bacteria. The method has high validity and efficiency, and it is run in low computational time in comparison to other approaches.
Accurate Prediction of Coronary Artery Disease Using Bioinformatics Algorithms

Directory of Open Access Journals (Sweden)

Hajar Shafiee

2016-06-01

Full Text Available Background and Objectives: Cardiovascular disease is one of the main causes of death in developed and Third World countries. According to the statement of the World Health Organization, it is predicted that death due to heart disease will rise to 23 million by 2030. According to the latest statistics reported by Iran’s Minister of health, 3.39% of all deaths are attributed to cardiovascular diseases and 19.5% are related to myocardial infarction. The aim of this study was to predict coronary artery disease using data mining algorithms. Methods: In this study, various bioinformatics algorithms, such as decision trees, neural networks, support vector machines, clustering, etc., were used to predict coronary heart disease. The data used in this study was taken from several valid databases (including 14 data. Results: In this research, data mining techniques can be effectively used to diagnose different diseases, including coronary artery disease. Also, for the first time, a prediction system based on support vector machine with the best possible accuracy was introduced. Conclusion: The results showed that among the features, thallium scan variable is the most important feature in the diagnosis of heart disease. Designation of machine prediction models, such as support vector machine learning algorithm can differentiate between sick and healthy individuals with 100% accuracy.
An algorithm to discover gene signatures with predictive potential

Directory of Open Access Journals (Sweden)

Hallett Robin M

2010-09-01

Full Text Available Abstract Background The advent of global gene expression profiling has generated unprecedented insight into our molecular understanding of cancer, including breast cancer. For example, human breast cancer patients display significant diversity in terms of their survival, recurrence, metastasis as well as response to treatment. These patient outcomes can be predicted by the transcriptional programs of their individual breast tumors. Predictive gene signatures allow us to correctly classify human breast tumors into various risk groups as well as to more accurately target therapy to ensure more durable cancer treatment. Results Here we present a novel algorithm to generate gene signatures with predictive potential. The method first classifies the expression intensity for each gene as determined by global gene expression profiling as low, average or high. The matrix containing the classified data for each gene is then used to score the expression of each gene based on its individual ability to predict the patient characteristic of interest. Finally, all examined genes are ranked based on their predictive ability and the most highly ranked genes are included in the master gene signature, which is then ready for use as a predictor. This method was used to accurately predict the survival outcomes in a cohort of human breast cancer patients. Conclusions We confirmed the capacity of our algorithm to generate gene signatures with bona fide predictive ability. The simplicity of our algorithm will enable biological researchers to quickly generate valuable gene signatures without specialized software or extensive bioinformatics training.
A Wavelet Analysis-Based Dynamic Prediction Algorithm to Network Traffic

Directory of Open Access Journals (Sweden)

Meng Fan-Bo

2016-01-01

Full Text Available Network traffic is an important parameter for network traffic engineering, yet it is highly dynamic in nature. Accordingly, it is difficult to directly predict the traffic amount of end-to-end flows. This paper proposes a new prediction algorithm for network traffic using wavelet analysis. Firstly, network traffic is converted into the time-frequency domain to capture the time-frequency features of network traffic. Secondly, in different frequency components, we model network traffic in the time-frequency domain. Finally, we build the prediction model for network traffic. At the same time, the corresponding prediction algorithm is presented to attain network traffic prediction. Simulation results indicate that our approach is promising.
A utility/cost analysis of breast cancer risk prediction algorithms

Science.gov (United States)

Abbey, Craig K.; Wu, Yirong; Burnside, Elizabeth S.; Wunderlich, Adam; Samuelson, Frank W.; Boone, John M.

2016-03-01

Breast cancer risk prediction algorithms are used to identify subpopulations that are at increased risk for developing breast cancer. They can be based on many different sources of data such as demographics, relatives with cancer, gene expression, and various phenotypic features such as breast density. Women who are identified as high risk may undergo a more extensive (and expensive) screening process that includes MRI or ultrasound imaging in addition to the standard full-field digital mammography (FFDM) exam. Given that there are many ways that risk prediction may be accomplished, it is of interest to evaluate them in terms of expected cost, which includes the costs of diagnostic outcomes. In this work we perform an expected-cost analysis of risk prediction algorithms that is based on a published model that includes the costs associated with diagnostic outcomes (true-positive, false-positive, etc.). We assume the existence of a standard screening method and an enhanced screening method with higher scan cost, higher sensitivity, and lower specificity. We then assess the expected cost of using a risk prediction algorithm to determine who gets the enhanced screening method under the strong assumption that risk and diagnostic performance are independent. We find that if risk prediction leads to a high enough positive predictive value, it will be cost-effective regardless of the size of the subpopulation. Furthermore, in terms of the hit-rate and false-alarm rate of the risk prediction algorithm, iso-cost contours are lines with slope determined by properties of the available diagnostic systems for screening.
Efficient predictive algorithms for image compression

CERN Document Server

Rosário Lucas, Luís Filipe; Maciel de Faria, Sérgio Manuel; Morais Rodrigues, Nuno Miguel; Liberal Pagliari, Carla

2017-01-01

This book discusses efficient prediction techniques for the current state-of-the-art High Efficiency Video Coding (HEVC) standard, focusing on the compression of a wide range of video signals, such as 3D video, Light Fields and natural images. The authors begin with a review of the state-of-the-art predictive coding methods and compression technologies for both 2D and 3D multimedia contents, which provides a good starting point for new researchers in the field of image and video compression. New prediction techniques that go beyond the standardized compression technologies are then presented and discussed. In the context of 3D video, the authors describe a new predictive algorithm for the compression of depth maps, which combines intra-directional prediction, with flexible block partitioning and linear residue fitting. New approaches are described for the compression of Light Field and still images, which enforce sparsity constraints on linear models. The Locally Linear Embedding-based prediction method is in...
Energy Link Optimization in a Wireless Power Transfer Grid under Energy Autonomy Based on the Improved Genetic Algorithm

Directory of Open Access Journals (Sweden)

Zhihao Zhao

2016-08-01

Full Text Available In this paper, an optimization method is proposed for the energy link in a wireless power transfer grid, which is a regional smart microgrid comprised of distributed devices equipped with wireless power transfer technology in a certain area. The relevant optimization model of the energy link is established by considering the wireless power transfer characteristics and the grid characteristics brought in by the device repeaters. Then, a concentration adaptive genetic algorithm (CAGA) is proposed to optimize the energy link. The algorithm avoided the unification trend by introducing the concentration mechanism and a new crossover method named forward order crossover, as well as the adaptive parameter mechanism, which are utilized together to keep the diversity of the optimization solution groups. The results show that CAGA is feasible and competitive for the energy link optimization in different situations. This proposed algorithm performs better than its counterparts in the global convergence ability and the algorithm robustness.
Appendix F. Developmental enforcement algorithm definition document : predictive braking enforcement algorithm definition document.

Science.gov (United States)

2012-05-01

The purpose of this document is to fully define and describe the logic flow and mathematical equations for a predictive braking enforcement algorithm intended for implementation in a Positive Train Control (PTC) system.
Link Prediction via Convex Nonnegative Matrix Factorization on Multiscale Blocks

Directory of Open Access Journals (Sweden)

Enming Dong

2014-01-01

Full Text Available Low-rank matrix approximations have been used in link prediction for networks; they are usually globally optimal methods and do not use local information. The block structure is a significant local feature of matrices: entities in the same block have similar values, which implies that links are more likely to be found within dense blocks. We use this insight to give a probabilistic latent variable model for finding missing links by convex nonnegative matrix factorization with block detection. The experiments show that this method gives better prediction accuracy than the original method alone. Unlike the original low-rank matrix approximation methods for link prediction, the sparseness of the solutions accords with the sparsity of most real complex networks. To scale to massive networks, we use the block information to map matrices onto distributed architectures and give a divide-and-conquer prediction method. The experiments show that it gives better results than the common neighbors method when the networks have a large number of missing links.
Performance of local information-based link prediction: a sampling perspective

Science.gov (United States)

Zhao, Jichang; Feng, Xu; Dong, Li; Liang, Xiao; Xu, Ke

2012-08-01

Link prediction is pervasively employed to uncover the missing links in the snapshots of real-world networks, which are usually obtained through different kinds of sampling methods. In the previous literature, in order to evaluate the performance of the prediction, known edges in the sampled snapshot are divided into the training set and the probe set randomly, without considering the underlying sampling approaches. However, different sampling methods might lead to different missing links, especially for the biased ways. For this reason, random partition-based evaluation of performance is no longer convincing if we take the sampling method into account. In this paper, we try to re-evaluate the performance of local information-based link predictions through sampling method governed division of the training set and the probe set. It is interesting that we find that for different sampling methods, each prediction approach performs unevenly. Moreover, most of these predictions perform weakly when the sampling method is biased, which indicates that the performance of these methods might have been overestimated in the prior works.
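The conventional evaluation that this abstract questions can be sketched as follows. This is a minimal illustration, not the authors' code: it uses the common-neighbours index as one example of a local information-based predictor, divides known edges randomly into a training set and a probe set, and scores the result with AUC. The function names, the toy graph and the list of non-existent edges are all assumptions made for illustration.

```python
import random

def build_adjacency(edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors_score(adj, u, v):
    """Local similarity: number of neighbours shared by u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))

def random_split(edges, probe_fraction, seed=0):
    """Randomly divide known edges into (training set, probe set)."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_probe = max(1, int(len(shuffled) * probe_fraction))
    return shuffled[n_probe:], shuffled[:n_probe]

def auc(adj, probe_edges, non_edges):
    """Chance that a probe edge outscores a non-existent edge (ties count 0.5)."""
    wins = 0.0
    for pu, pv in probe_edges:
        s_probe = common_neighbors_score(adj, pu, pv)
        for nu, nv in non_edges:
            s_non = common_neighbors_score(adj, nu, nv)
            if s_probe > s_non:
                wins += 1.0
            elif s_probe == s_non:
                wins += 0.5
    return wins / (len(probe_edges) * len(non_edges))

# Toy graph (hypothetical data).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (2, 4)]
train, probe = random_split(edges, probe_fraction=0.2)
adj = build_adjacency(train)
non_edges = [(1, 5), (3, 5)]  # pairs that are not linked in the toy graph
print(auc(adj, probe, non_edges))
```

A sampling-governed evaluation, as the abstract proposes, would replace `random_split` with a division that imitates how the snapshot was sampled; the scoring and AUC steps would stay the same.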
Real coded genetic algorithm for fuzzy time series prediction

Science.gov (United States)

Jain, Shilpa; Bisht, Dinesh C. S.; Singh, Phool; Mathpal, Prakash C.

2017-10-01

Genetic Algorithm (GA) forms a subset of evolutionary computing, a rapidly growing area of Artificial Intelligence (A.I.). Some variants of GA are binary GA, real GA, messy GA, micro GA, saw tooth GA, and differential evolution GA. This research article presents a real coded GA for predicting enrollments of the University of Alabama. The University of Alabama enrollment data is a fuzzy time series. Here, fuzzy logic is used to predict the enrollments, and a genetic algorithm optimizes the fuzzy intervals. The results are compared with the works of other eminent authors and found satisfactory, indicating that real coded GAs are fast and accurate.
Fuzzy model predictive control algorithm applied in nuclear power plant

International Nuclear Information System (INIS)

Zuheir, Ahmad

2006-01-01

The aim of this paper is to design a predictive controller based on a fuzzy model. The Takagi-Sugeno fuzzy model with an Adaptive B-splines neuro-fuzzy implementation is used and incorporated as a predictor in a predictive controller. An optimization approach with a simplified gradient technique is used to calculate predictions of the future control actions. In this approach, adaptation of the fuzzy model using dynamic process information is carried out to build the predictive controller. The easy description of the fuzzy model and the easy computation of the gradient sector during the optimization procedure are the main advantages of the computation algorithm. The algorithm is applied to the control of a U-tube steam generation unit (UTSG) used for electricity generation. (author)
Development of a Thermal Equilibrium Prediction Algorithm

International Nuclear Information System (INIS)

Aviles-Ramos, Cuauhtemoc

2002-01-01

A thermal equilibrium prediction algorithm is developed and tested using a heat conduction model and data sets from calorimetric measurements. The physical model used in this study is the exact solution of a system of two partial differential equations that govern the heat conduction in the calorimeter. A multi-parameter estimation technique is developed and implemented to estimate the effective volumetric heat generation and thermal diffusivity in the calorimeter measurement chamber, and the effective thermal diffusivity of the heat flux sensor. These effective properties and the exact solution are used to predict the heat flux sensor voltage readings at thermal equilibrium. Thermal equilibrium predictions are carried out considering only 20% of the total measurement time required for thermal equilibrium. A comparison of the predicted and experimental thermal equilibrium voltages shows that the average percentage error from 330 data sets is only 0.1%. The data sets used in this study come from calorimeters of different sizes that use different kinds of heat flux sensors. Furthermore, different nuclear material matrices were assayed in the process of generating these data sets. This study shows that the integration of this algorithm into the calorimeter data acquisition software will result in an 80% reduction of measurement time. This reduction results in a significant cutback in operational costs for the calorimetric assay of nuclear materials. (authors)
A First-order Prediction-Correction Algorithm for Time-varying (Constrained) Optimization: Preprint

Energy Technology Data Exchange (ETDEWEB)

Dall-Anese, Emiliano [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Simonetto, Andrea [Universite catholique de Louvain

2017-07-25

This paper focuses on the design of online algorithms based on prediction-correction steps to track the optimal solution of a time-varying constrained problem. Existing prediction-correction methods have been shown to work well for unconstrained convex problems and for settings where obtaining the inverse of the Hessian of the cost function can be computationally affordable. The prediction-correction algorithm proposed in this paper addresses the limitations of existing methods by tackling constrained problems and by designing a first-order prediction step that relies on the Hessian of the cost function (and does not require the computation of its inverse). Analytical results are established to quantify the tracking error. Numerical simulations corroborate the analytical results and showcase performance and benefits of the algorithms.

Comparison of four Adaboost algorithm based artificial neural networks in wind speed predictions

International Nuclear Information System (INIS)

Liu, Hui; Tian, Hong-qi; Li, Yan-fei; Zhang, Lei

2015-01-01

Highlights: • Four hybrid algorithms are proposed for the wind speed decomposition. • Adaboost algorithm is adopted to provide a hybrid training framework. • MLP neural networks are built to do the forecasting computation. • Four important network training algorithms are included in the MLP networks. • All the proposed hybrid algorithms are suitable for the wind speed predictions. - Abstract: The technology of wind speed prediction is important to guarantee the safety of wind power utilization. In this paper, four different hybrid methods are proposed for the high-precision multi-step wind speed predictions based on the Adaboost (Adaptive Boosting) algorithm and the MLP (Multilayer Perceptron) neural networks. In the hybrid Adaboost–MLP forecasting architecture, four important algorithms are adopted for the training and modeling of the MLP neural networks, including GD-ALR-BP algorithm, GDM-ALR-BP algorithm, CG-BP-FR algorithm and BFGS algorithm. The aim of the study is to investigate the promoted forecasting percentages of the MLP neural networks by the Adaboost algorithm's optimization under various training algorithms. The hybrid models in the performance comparison include Adaboost–GD-ALR-BP–MLP, Adaboost–GDM-ALR-BP–MLP, Adaboost–CG-BP-FR–MLP, Adaboost–BFGS–MLP, GD-ALR-BP–MLP, GDM-ALR-BP–MLP, CG-BP-FR–MLP and BFGS–MLP. Two experimental results show that: (1) the proposed hybrid Adaboost–MLP forecasting architecture is effective for the wind speed predictions; (2) the Adaboost algorithm has promoted the forecasting performance of the MLP neural networks considerably; (3) among the proposed Adaboost–MLP forecasting models, the Adaboost–CG-BP-FR–MLP model has the best performance; and (4) the improved percentages of the MLP neural networks by the Adaboost algorithm decrease step by step with the following sequence of training algorithms as: GD-ALR-BP, GDM-ALR-BP, CG-BP-FR and BFGS
Chaos Time Series Prediction Based on Membrane Optimization Algorithms

Directory of Open Access Journals (Sweden)

Meng Li

2015-01-01

Full Text Available This paper puts forward a prediction model based on membrane computing optimization algorithm for chaos time series; the model optimizes simultaneously the parameters of phase space reconstruction (τ,m) and least squares support vector machine (LS-SVM) (γ,σ) by using membrane computing optimization algorithm. It is an important basis for spectrum management to predict accurately the change trend of parameters in the electromagnetic environment, which can help decision makers to adopt an optimal action. Then, the model presented in this paper is used to forecast band occupancy rate of frequency modulation (FM) broadcasting band and interphone band. To show the applicability and superiority of the proposed model, this paper will compare the forecast model presented in it with conventional similar models. The experimental results show that whether single-step prediction or multistep prediction, the proposed model performs best based on three error measures, namely, normalized mean square error (NMSE), root mean square error (RMSE), and mean absolute percentage error (MAPE).
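The three error measures named in this abstract can be sketched as below. The abstract does not give the formulas, so these are commonly used definitions and should be read as assumptions: in particular, NMSE is taken here as the sum of squared errors divided by the sum of squared deviations of the actual series from its mean, and other normalizations exist. The function names and sample values are illustrative only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def nmse(actual, predicted):
    """Squared error normalized by the spread of the actual series (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    spread = sum((a - mean_actual) ** 2 for a in actual)
    return squared_error / spread

# Hypothetical band occupancy values and forecasts.
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 2.0, 2.0]
print(rmse(actual, predicted), mape(actual, predicted), nmse(actual, predicted))
```

Under this definition an NMSE of 1 corresponds to a forecast no better than always predicting the mean of the actual series, which makes it convenient for comparing models across series with different scales.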
Reliability-Based Design Optimization of Trusses with Linked-Discrete Design Variables using the Improved Firefly Algorithm

Directory of Open Access Journals (Sweden)

N. M. Okasha

2016-04-01

Full Text Available In this paper, an approach for conducting a Reliability-Based Design Optimization (RBDO) of truss structures with linked-discrete design variables is proposed. The sections of the truss members are selected from the AISC standard tables and thus the design variables that represent the properties of each section are linked. Latin hypercube sampling is used in the evaluation of the structural reliability. The improved firefly algorithm is used for the optimization solution process. It was found that in order to use the improved firefly algorithm for efficiently solving problems of reliability-based design optimization with linked-discrete design variables, it needs to be modified as proposed in this paper to accelerate its convergence.
Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

Science.gov (United States)

Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

2017-05-25

Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.
Increasing Prediction the Original Final Year Project of Student Using Genetic Algorithm

Science.gov (United States)

Saragih, Rijois Iboy Erwin; Turnip, Mardi; Sitanggang, Delima; Aritonang, Mendarissan; Harianja, Eva

2018-04-01

The final year project is very important for a student's graduation. Unfortunately, many students do not take their final projects seriously, and many ask someone else to do the work for them. In this paper, an application of genetic algorithms to predict whether a student's final year project is original is proposed. In the simulation, the data of the final projects for the last 5 years is collected. The genetic algorithm has several operators, namely population, selection, crossover, and mutation. The results suggest that the genetic algorithm can predict better than other comparable models. Experimental prediction results showed that 70% was more accurate than in previous research.
Constrained Active Learning for Anchor Link Prediction Across Multiple Heterogeneous Social Networks.

Science.gov (United States)

Zhu, Junxing; Zhang, Jiawei; Wu, Quanyuan; Jia, Yan; Zhou, Bin; Wei, Xiaokai; Yu, Philip S

2017-08-03

Nowadays, people are usually involved in multiple heterogeneous social networks simultaneously. Discovering the anchor links between the accounts owned by the same users across different social networks is crucial for many important inter-network applications, e.g., cross-network link transfer and cross-network recommendation. Many different supervised models have been proposed to predict anchor links so far, but they are effective only when the labeled anchor links are abundant. However, in real scenarios, such a requirement can hardly be met and most anchor links are unlabeled, since manually labeling the inter-network anchor links is quite costly and tedious. To overcome such a problem and utilize the numerous unlabeled anchor links in model building, in this paper, we introduce the active learning based anchor link prediction problem. Different from the traditional active learning problems, due to the one-to-one constraint on anchor links, if an unlabeled anchor link a = ( u , v ) is identified as positive (i.e., existing), all the other unlabeled anchor links incident to account u or account v will be negative (i.e., non-existing) automatically. Viewed in such a perspective, asking for the labels of potential positive anchor links in the unlabeled set will be rewarding in the active anchor link prediction problem. Various novel anchor link information gain measures are defined in this paper, based on which several constraint active anchor link prediction methods are introduced. Extensive experiments have been done on real-world social network datasets to compare the performance of these methods with state-of-art anchor link prediction methods. The experimental results show that the proposed Mean-entropy-based Constrained Active Learning (MC) method can outperform other methods with significant advantages.
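The one-to-one constraint described in this abstract can be illustrated with a short sketch. It is not the authors' MC method: it only shows how labelling one candidate anchor link a = (u, v) as positive automatically labels every other unlabeled candidate incident to u or v as negative. The function name, the dictionary representation of labels and the account identifiers are assumptions.

```python
def propagate_positive(anchor, candidates, labels):
    """Label `anchor` positive and every other unlabeled candidate that
    shares an account with it negative (one-to-one constraint)."""
    u, v = anchor
    labels[anchor] = True
    for a, b in candidates:
        if (a, b) == anchor or (a, b) in labels:
            continue
        if a == u or b == v:
            labels[(a, b)] = False
    return labels

# Hypothetical accounts: u* in one network, v* in the other.
candidates = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2")]
labels = propagate_positive(("u1", "v1"), candidates, {})
print(labels)
```

One positive answer here resolves three of the four candidates, which is why querying likely positive links is rewarding in this setting.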
Exploring the significance of human mobility patterns in social link prediction

KAUST Repository

Alharbi, Basma Mohammed

2014-01-01

Link prediction is a fundamental task in social networks. Recently, emphasis has been placed on forecasting new social ties using user mobility patterns, e.g., investigating physical and semantic co-locations for new proximity measures. This paper explores the effect of in-depth mobility patterns. Specifically, we study individuals' movement behavior, and quantify mobility on the basis of trip frequency, travel purpose and transportation mode. Our hybrid link prediction model is composed of two modules. The first module extracts mobility patterns, including travel purpose and mode, from raw trajectory data. The second module employs the extracted patterns for link prediction. We evaluate our method on two real data sets, GeoLife [15] and Reality Mining [5]. Experimental results show that our hybrid model significantly improves the accuracy of social link prediction, compared with primary topology-based solutions. Copyright 2014 ACM.
Investigation of a breathing surrogate prediction algorithm for prospective pulmonary gating

International Nuclear Information System (INIS)

White, Benjamin M.; Low, Daniel A.; Zhao Tianyu; Wuenschel, Sara; Lu, Wei; Lamb, James M.; Mutic, Sasa; Bradley, Jeffrey D.; El Naqa, Issam

2011-01-01

Purpose: A major challenge of four dimensional computed tomography (4DCT) in treatment planning and delivery has been the lack of respiration amplitude and phase reproducibility during image acquisition. The implementation of a prospective gating algorithm would ensure that images would be acquired only during user-specified breathing phases. This study describes the development and testing of an autoregressive moving average (ARMA) model for human respiratory phase prediction under quiet respiration conditions. Methods: A total of 47 4DCT patient datasets and synchronized respiration records was utilized in this study. Three datasets were used in model development and were removed from further evaluation of the ARMA model. The remaining 44 patient datasets were evaluated with the ARMA model for prediction time steps from 50 to 1000 ms in increments of 50 and 100 ms. Thirty-five of these datasets were further used to provide a comparison between the proposed ARMA model and a commercial algorithm with a prediction time step of 240 ms. Results: The optimal number of parameters for the ARMA model was based on three datasets reserved for model development. Prediction error was found to increase as the prediction time step increased. The minimum prediction time step required for prospective gating was selected to be half of the gantry rotation period. The maximum prediction time step with a conservative 95% confidence criterion was found to be 0.3 s. The ARMA model predicted peak inhalation and peak exhalation phases significantly better than the commercial algorithm. Furthermore, the commercial algorithm had numerous instances of missed breath cycles and falsely predicted breath cycles, while the proposed model did not have these errors. Conclusions: An ARMA model has been successfully applied to predict human respiratory phase occurrence. For a typical CT scanner gantry rotation period of 0.4 s (0.2 s prediction time step), the absolute error was relatively small, 0
The Ship Movement Trajectory Prediction Algorithm Using Navigational Data Fusion.

Science.gov (United States)

Borkowski, Piotr

2017-06-20

It is essential for the marine navigator conducting maneuvers of his ship at sea to know future positions of himself and target ships in a specific time span to effectively solve collision situations. This article presents an algorithm of ship movement trajectory prediction, which, through data fusion, takes into account measurements of the ship's current position from a number of doubled autonomous devices. This increases the reliability and accuracy of prediction. The algorithm has been implemented in NAVDEC, a navigation decision support system and practically used on board ships.
The Ship Movement Trajectory Prediction Algorithm Using Navigational Data Fusion

Directory of Open Access Journals (Sweden)

Piotr Borkowski

2017-06-01

Full Text Available It is essential for the marine navigator conducting maneuvers of his ship at sea to know future positions of himself and target ships in a specific time span to effectively solve collision situations. This article presents an algorithm of ship movement trajectory prediction, which, through data fusion, takes into account measurements of the ship’s current position from a number of doubled autonomous devices. This increases the reliability and accuracy of prediction. The algorithm has been implemented in NAVDEC, a navigation decision support system and practically used on board ships.
Analysis of energy-based algorithms for RNA secondary structure prediction

Directory of Open Access Journals (Sweden)

Hajiaghayi Monir

2012-02-01

Full Text Available Background: RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE) predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA) or pseudo-expected accuracy (pseudo-MEA) methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-)MEA-based methods, with respect to the latest datasets and energy parameters. Results: We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-)MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence) of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms
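The benchmark quantities named in this abstract (sensitivity, positive predictive value, and their harmonic mean, the F-measure) can be sketched for a single RNA as follows. This is a minimal illustration that assumes a secondary structure is represented as a set of base pairs (i, j); the function name and the example pairs are hypothetical, and the bootstrap analysis described in the abstract is not reproduced.

```python
def f_measure(reference_pairs, predicted_pairs):
    """Return (sensitivity, PPV, F-measure) for predicted base pairs."""
    reference, predicted = set(reference_pairs), set(predicted_pairs)
    true_positives = len(reference & predicted)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    if sensitivity + ppv == 0.0:
        return sensitivity, ppv, 0.0
    # Harmonic mean of sensitivity and PPV.
    return sensitivity, ppv, 2.0 * sensitivity * ppv / (sensitivity + ppv)

# Hypothetical structures: base pairs given as (5' index, 3' index).
reference = [(1, 10), (2, 9), (3, 8), (4, 7)]
predicted = [(1, 10), (2, 9), (5, 6)]
print(f_measure(reference, predicted))
```

In this toy case two of four reference pairs are recovered (sensitivity 1/2) and two of three predicted pairs are correct (PPV 2/3), so the F-measure is 4/7.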
Predicting Coastal Flood Severity using Random Forest Algorithm

Science.gov (United States)

Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.

2017-12-01

Coastal floods have become more common recently and are predicted to further increase in frequency and severity due to sea level rise. Predicting floods in coastal cities can be difficult due to the number of environmental and geographic factors which can influence flooding events. Built stormwater infrastructure and irregular urban landscapes add further complexity. This paper demonstrates the use of machine learning algorithms in predicting street flood occurrence in an urban coastal setting. The model is trained and evaluated using data from Norfolk, Virginia USA from September 2010 - October 2016. Rainfall, tide levels, water table levels, and wind conditions are used as input variables. Street flooding reports made by city workers after named and unnamed storm events, ranging from 1-159 reports per event, are the model output. Results show that Random Forest provides predictive power in estimating the number of flood occurrences given a set of environmental conditions with an out-of-bag root mean squared error of 4.3 flood reports and a mean absolute error of 0.82 flood reports. The Random Forest algorithm performed much better than Poisson regression. From the Random Forest model, total daily rainfall was by far the most important factor in flood occurrence prediction, followed by daily low tide and daily higher high tide. The model demonstrated here could be used to predict flood severity based on forecast rainfall and tide conditions and could be further enhanced using more complete street flooding data for model training.
A Framing Link Based Tabu Search Algorithm for Large-Scale Multidepot Vehicle Routing Problems

Directory of Open Access Journals (Sweden)

Xuhao Zhang

2014-01-01

Full Text Available A framing link (FL)-based tabu search algorithm is proposed in this paper for a large-scale multidepot vehicle routing problem (LSMDVRP). Framing links are generated during continuous great optimization of current solutions and then taken as skeletons so as to improve optimal seeking ability, speed up the process of optimization, and obtain better results. Based on the comparison between pre- and postmutation routes in the current solution, different parts are extracted. In the current optimization period, links involved in the optimal solution are regarded as candidates to the FL base. Multiple optimization periods exist in the whole algorithm, and there are several potential FLs in each period. If the update condition is satisfied, the FL base is updated, new FLs are added into the current route, and the next period starts. Through adjusting the borderline of multidepot sharing area with dynamic parameters, the authors define candidate selection principles for three kinds of customer connections, respectively. Link split and the roulette approach are employed to choose FLs. 18 LSMDVRP instances in three groups are studied and new optimal solution values for nine of them are obtained, with higher computation speed and reliability.
Link prediction based on nonequilibrium cooperation effect

Science.gov (United States)

Li, Lanxi; Zhu, Xuzhen; Tian, Hui

2018-04-01

Link prediction in complex networks has become a common focus of many researchers. But most existing methods concentrate on neighbors, and rarely consider degree heterogeneity of two endpoints. Node degree represents the importance or status of endpoints. We describe the large-degree heterogeneity as the nonequilibrium between nodes. This nonequilibrium facilitates a stable cooperation between endpoints, so that two endpoints with large-degree heterogeneity tend to connect stably. We name such a phenomenon as the nonequilibrium cooperation effect. Therefore, this paper proposes a link prediction method based on the nonequilibrium cooperation effect to improve accuracy. Theoretical analysis will be processed in advance, and at the end, experiments will be performed in 12 real-world networks to compare the mainstream methods with our indices in the network through numerical analysis.
Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life

International Nuclear Information System (INIS)

Hu Chao; Youn, Byeng D.; Wang Pingfeng; Taek Yoon, Joung

2012-01-01

Prognostics aims at determining whether a failure of an engineered system (e.g., a nuclear power plant) is impending and estimating the remaining useful life (RUL) before the failure occurs. The traditional data-driven prognostic approach is to construct multiple candidate algorithms using a training data set, evaluate their respective performance using a testing data set, and select the one with the best performance while discarding all the others. This approach has three shortcomings: (i) the selected standalone algorithm may not be robust; (ii) it wastes the resources for constructing the algorithms that are discarded; (iii) it requires the testing data in addition to the training data. To overcome these drawbacks, this paper proposes an ensemble data-driven prognostic approach which combines multiple member algorithms with a weighted-sum formulation. Three weighting schemes, namely the accuracy-based weighting, diversity-based weighting and optimization-based weighting, are proposed to determine the weights of member algorithms. The k-fold cross validation (CV) is employed to estimate the prediction error required by the weighting schemes. The results obtained from three case studies suggest that the ensemble approach with any weighting scheme gives more accurate RUL predictions compared to any sole algorithm when member algorithms producing diverse RUL predictions have comparable prediction accuracy and that the optimization-based weighting scheme gives the best overall performance among the three weighting schemes.
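The weighted-sum formulation in this abstract can be sketched as follows. The abstract does not state how the accuracy-based weights are computed, so this sketch assumes weights proportional to the inverse of each member algorithm's k-fold cross-validation error, normalized to sum to one. The function names and all numbers are hypothetical.

```python
def accuracy_based_weights(cv_errors):
    """Weights proportional to 1 / CV error, normalized to sum to 1 (assumed scheme)."""
    inverse_errors = [1.0 / error for error in cv_errors]
    total = sum(inverse_errors)
    return [value / total for value in inverse_errors]

def ensemble_rul(member_predictions, weights):
    """Weighted-sum combination of the member algorithms' RUL predictions."""
    return sum(w * p for w, p in zip(weights, member_predictions))

# Hypothetical CV errors and RUL predictions (in cycles) of three member algorithms.
cv_errors = [10.0, 20.0, 40.0]
member_predictions = [100.0, 120.0, 80.0]
weights = accuracy_based_weights(cv_errors)
print(weights, ensemble_rul(member_predictions, weights))
```

No member is discarded in this scheme: a less accurate algorithm still contributes, only with a smaller weight, which is the contrast the abstract draws with selecting a single best algorithm.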
Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches.

Science.gov (United States)

Crichton, Gamal; Guo, Yufan; Pyysalo, Sampo; Korhonen, Anna

2018-05-21

Link prediction in biomedical graphs has several important applications including predicting Drug-Target Interactions (DTI), Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary works were done on this and report promising results. However they did not use realistic settings like time-slicing, evaluate performances with comprehensive metrics or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼ 6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to those of established baselines and report performance across five metrics. In random- and time-sliced experiments when the neural network methods were able to learn good node representations and there was a negligible amount of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼ 15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼ 0.3) the approaches were mostly equal, but at higher recall levels across all nodes and average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours; potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them. Our results indicate
TaDb: A time-aware diffusion-based recommender algorithm

Science.gov (United States)

Li, Wen-Jun; Xu, Yuan-Yuan; Dong, Qiang; Zhou, Jun-Lin; Fu, Yan

2015-02-01

Traditional recommender algorithms usually employ the early and recent records indiscriminately, which overlooks the change of user interests over time. In this paper, we show that the interests of a user remain stable in a short-term interval and drift during a long-term period. Based on this observation, we propose a time-aware diffusion-based (TaDb) recommender algorithm, which assigns different temporal weights to the leading links existing before the target user's collection and the following links appearing after that in the diffusion process. Experiments on four real datasets (Netflix, MovieLens, FriendFeed and Delicious) show that the TaDb algorithm significantly improves the prediction accuracy compared with algorithms that do not consider temporal effects.
Testing earthquake prediction algorithms: Statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992-1997

Science.gov (United States)

Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.

1999-01-01

Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
Appropriate Combination of Artificial Intelligence and Algorithms for Increasing Predictive Accuracy Management

Directory of Open Access Journals (Sweden)

Shahram Gilani Nia

2010-03-01

Full Text Available In this paper, a simple and effective expert system for predicting short-term fluctuations in random data is established. The evaluation process introduces Fourier series and Markov chain prediction models and compares them with the combined Grey-Fourier-Markov prediction model; the combined results are used to build an expert prediction system based on artificial intelligence, intended to make the prediction of random fluctuations in most data management programs more effective. The outcome of this study is a set of artificial intelligence algorithms that help create a computer-based expert system able to predict short-term, unstable situations correctly and accurately. To test the effectiveness of the algorithm, data from a previous study (Chen Tzay len, 2008) and tourism demand data for Iran are used. Results for the two countries show that the model output has high accuracy.
Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms

DEFF Research Database (Denmark)

Nielsen, Henrik

2015-01-01

was used. Various statistical and machine learning algorithms are used with all three approaches, and various measures and standards are employed when reporting the performances of the developed methods. This chapter presents a number of available methods for prediction of sorting signals and subcellular...

Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis

Directory of Open Access Journals (Sweden)

Charles Frank

2018-03-01

Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.
An Improved User Selection Algorithm in Multiuser MIMO Broadcast with Channel Prediction

Science.gov (United States)

Min, Zhi; Ohtsuki, Tomoaki

In multiuser MIMO-BC (Multiple-Input Multiple-Output Broadcasting) systems, user selection is important to achieve multiuser diversity. The optimal user selection algorithm is to try all the combinations of users to find the user group that can achieve the multiuser diversity. Unfortunately, the high calculation cost of the optimal algorithm prevents its implementation. Thus, instead of the optimal algorithm, some suboptimal user selection algorithms were proposed based on semiorthogonality of user channel vectors. The purpose of this paper is to achieve multiuser diversity with a small amount of calculation. For this purpose, we propose a user selection algorithm that can improve the orthogonality of a selected user group. We also apply a channel prediction technique to a MIMO-BC system to get more accurate channel information at the transmitter. Simulation results show that the channel prediction can improve the accuracy of channel information for user selections, and the proposed user selection algorithm achieves higher sum rate capacity than the SUS (Semiorthogonal User Selection) algorithm. Also we discuss the setting of the algorithm threshold. As the result of a discussion on the calculation complexity, which uses the number of complex multiplications as the parameter, the proposed algorithm is shown to have a calculation complexity almost equal to that of the SUS algorithm, and they are much lower than that of the optimal user selection algorithm.
Predictive Power Estimation Algorithm (PPEA)--a new algorithm to reduce overfitting for genomic biomarker discovery.

Directory of Open Access Journals (Sweden)

Jiangang Liu

Full Text Available Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high dimensional and high noise genomic data is prone to over-fitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model intends to predict, which can make it challenging to interpret the modeling results. To address these issues, we developed a novel algorithm, Predictive Power Estimation Algorithm (PPEA), which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By repeatedly enforcing that the sample number is larger than the transcript number, in each iteration of modeling and testing, PPEA reduces the potential risk of overfitting. We show with three different case studies that: (1) PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2) the top ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3) using only the most predictive top ranked transcripts greatly facilitates development of multiplex assay such as qRT-PCR as a biomarker, and (4) more importantly, we were able to demonstrate that a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that the PPEA model effectively addresses the over-fitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.
RAINLINK: Retrieval algorithm for rainfall monitoring employing microwave links from a cellular communication network

Science.gov (United States)

Uijlenhoet, R.; Overeem, A.; Leijnse, H.; Rios Gaona, M. F.

2017-12-01

The basic principle of rainfall estimation using microwave links is as follows. Rainfall attenuates the electromagnetic signals transmitted from one telephone tower to another. By measuring the received power at one end of a microwave link as a function of time, the path-integrated attenuation due to rainfall can be calculated, which can be converted to average rainfall intensities over the length of a link. Microwave links from cellular communication networks have been proposed as a promising new rainfall measurement technique for a decade. They are particularly interesting for those countries where few surface rainfall observations are available. Yet to date no operational (real-time) link-based rainfall products are available. To advance the process towards operational application and upscaling of this technique, there is a need for freely available, user-friendly computer code for microwave link data processing and rainfall mapping. Such software is now available as R package "RAINLINK" on GitHub (https://github.com/overeem11/RAINLINK). It contains a working example to compute link-based 15-min rainfall maps for the entire surface area of The Netherlands for 40 hours from real microwave link data. This is a working example using actual data from an extensive network of commercial microwave links, for the first time, which will allow users to test their own algorithms and compare their results with ours. The package consists of modular functions, which facilitates running only part of the algorithm. The main processing steps are: 1) Preprocessing of link data (initial quality and consistency checks); 2) Wet-dry classification using link data; 3) Reference signal determination; 4) Removal of outliers; 5) Correction of received signal powers; 6) Computation of mean path-averaged rainfall intensities; 7) Interpolation of rainfall intensities; 8) Rainfall map visualisation. Some applications of RAINLINK will be shown based on microwave link data from a
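The conversion from path-integrated attenuation to a path-averaged rainfall intensity can be sketched as follows. This is not RAINLINK code (RAINLINK is an R package and its functions are not reproduced here). The sketch assumes the commonly used power-law relation k = a·R^b between specific attenuation k (dB/km) and rain rate R (mm/h); the default values of `a` and `b` are placeholders, since real coefficients depend on link frequency and polarisation.

```python
def path_averaged_rain_rate(p_ref_dbm, p_rx_dbm, length_km, a=0.2, b=1.0):
    """Estimate the mean rain rate (mm/h) along one microwave link.

    p_ref_dbm: reference (dry-weather) received power, in dBm
    p_rx_dbm:  received power during the interval of interest, in dBm
    length_km: link path length, in km
    a, b:      power-law coefficients (placeholder values, assumed here)
    """
    # Path-integrated attenuation; negative values are treated as no rain.
    attenuation_db = max(p_ref_dbm - p_rx_dbm, 0.0)
    # Specific attenuation in dB/km, averaged over the path.
    specific_attenuation = attenuation_db / length_km
    # Invert k = a * R**b to obtain R.
    return (specific_attenuation / a) ** (1.0 / b)


# A 3 km link that loses 6 dB relative to its dry-weather reference level.
example_rate = path_averaged_rain_rate(-40.0, -46.0, 3.0)
```

Steps such as wet-dry classification, outlier removal and spatial interpolation from the list above are deliberately left out of this sketch.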
Prediction of Baseflow Index of Catchments using Machine Learning Algorithms

Science.gov (United States)

Yadav, B.; Hatfield, K.

2017-12-01

We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elasticnet, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from daily streamflow hydrograph using HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation model, soil, landuse, climate data, other publicly available ancillary and geospatial data. 80% catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables selected after the careful evaluation of bias-variance tradeoff include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation. The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized
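The evaluation protocol described in this abstract (an 80/20 split of catchments and an r-square score on the held-out set) can be sketched in plain Python. The helper names and the fixed random seed are assumptions made for illustration; the authors' actual tooling is not stated in the abstract.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and hold out a fraction as a test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical catchment identifiers standing in for the gauged basins.
train_ids, test_ids = train_test_split(range(800))
```

A model that always predicts the mean of the observed values scores 0 under this definition, so a threshold such as 0.7 indicates that most of the variance in the held-out BFI values is explained.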
Incorporating functional inter-relationships into protein function prediction algorithms

Directory of Open Access Journals (Sweden)

Kumar Vipin

2009-05-01

Full Text Available Abstract Background Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder to learn distinction between similar GO terms, for standard classification-based approaches. Results We propose a method to enhance the performance of classification-based protein function prediction algorithms by addressing the issue of using these inter-relationships between functional classes constituting functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes that benefit most from this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of
A unified algorithm for predicting partition coefficients for PBPK modeling of drugs and environmental chemicals

International Nuclear Information System (INIS)

Peyret, Thomas; Poulin, Patrick; Krishnan, Kannan

2010-01-01

The algorithms in the literature for predicting tissue:blood PC (P_tb) for environmental chemicals and tissue:plasma PC based on total (K_p) or unbound concentration (K_pu) for drugs differ in their consideration of binding to hemoglobin, plasma proteins and charged phospholipids. The objective of the present study was to develop a unified algorithm such that P_tb, K_p and K_pu for both drugs and environmental chemicals could be predicted. The development of the unified algorithm was accomplished by integrating all mechanistic algorithms previously published to compute the PCs. Furthermore, the algorithm was structured in such a way as to facilitate predictions of the distribution of organic compounds at the macro (i.e. whole tissue) and micro (i.e. cells and fluids) levels. The resulting unified algorithm was applied to compute the rat P_tb, K_p or K_pu of muscle (n = 174), liver (n = 139) and adipose tissue (n = 141) for acidic, neutral, zwitterionic and basic drugs as well as ketones, acetate esters, alcohols, aliphatic hydrocarbons, aromatic hydrocarbons and ethers. The unified algorithm reproduced adequately the values predicted previously by the published algorithms for a total of 142 drugs and chemicals. The sensitivity analysis demonstrated the relative importance of the various compound properties reflective of specific mechanistic determinants relevant to prediction of PC values of drugs and environmental chemicals. Overall, the present unified algorithm uniquely facilitates the computation of macro and micro level PCs for developing organ and cellular-level PBPK models for both chemicals and drugs.
Open-source chemogenomic data-driven algorithms for predicting drug-target interactions.

Science.gov (United States)

Hao, Ming; Bryant, Stephen H; Wang, Yanli

2018-02-06

While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet improved, prompting researchers to look for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, which have proved to be successful in drug discovery. Doubtlessly, the availability of openly accessible data from basic chemical biology research and the success of human genome sequencing are crucial for developing effective in silico drug repositioning methods allowing the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with publicly accessible source code for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language, and compared these algorithms by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred. Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.
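The abstract does not define mean percentile ranking, so the following is only one plausible reading, stated as an assumption: for each held-out true drug-target pair, take the fraction of other candidate targets that the model scores strictly higher than the true target, then average over all pairs (lower is better). The function name and the toy scores are hypothetical, and the sketch is in Python rather than the R used by the authors.

```python
def mean_percentile_rank(scores, true_pairs):
    """Average percentile rank of known targets; 0.0 means always ranked first.

    scores:     {drug: {target: predicted score}} (higher = more likely)
    true_pairs: list of (drug, target) pairs known to interact
    """
    percentiles = []
    for drug, target in true_pairs:
        candidates = scores[drug]
        # Count candidates the model prefers over the true target.
        higher = sum(1 for s in candidates.values() if s > candidates[target])
        percentiles.append(higher / (len(candidates) - 1))
    return sum(percentiles) / len(percentiles)


# Hypothetical predictions for two drugs over three candidate targets.
toy_scores = {
    "d1": {"t1": 0.9, "t2": 0.5, "t3": 0.1},
    "d2": {"t1": 0.2, "t2": 0.3, "t3": 0.8},
}
mpr = mean_percentile_rank(toy_scores, [("d1", "t1"), ("d2", "t2")])
```

Under this reading the metric rewards placing true targets near the top of each drug's ranked list, which is why it behaves like a recall-oriented measure.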
A prediction algorithm for first onset of major depression in the general population: development and validation.

Science.gov (United States)

Wang, JianLi; Sareen, Jitender; Patten, Scott; Bolton, James; Schmitz, Norbert; Birney, Arden

2014-05-01

Prediction algorithms are useful for making clinical decisions and for population health planning. However, such prediction algorithms for first onset of major depression do not exist. The objective of this study was to develop and validate a prediction algorithm for first onset of major depression in the general population. Longitudinal study design with approximately 3-year follow-up. The study was based on data from a nationally representative sample of the US general population. A total of 28 059 individuals who participated in Waves 1 and 2 of the US National Epidemiologic Survey on Alcohol and Related Conditions and who had not had major depression at Wave 1 were included. The prediction algorithm was developed using logistic regression modelling in 21 813 participants from three census regions. The algorithm was validated in participants from the 4th census region (n=6246). Major depression occurred since Wave 1 of the National Epidemiologic Survey on Alcohol and Related Conditions, assessed by the Alcohol Use Disorder and Associated Disabilities Interview Schedule-diagnostic and statistical manual for mental disorders IV. A prediction algorithm containing 17 unique risk factors was developed. The algorithm had good discriminative power (C statistic=0.7538, 95% CI 0.7378 to 0.7699) and excellent calibration (F-adjusted test=1.00, p=0.448) with the weighted data. In the validation sample, the algorithm had a C statistic of 0.7259 and excellent calibration (Hosmer-Lemeshow χ²=3.41, p=0.906). The developed prediction algorithm has good discrimination and calibration capacity. It can be used by clinicians, mental health policy-makers and service planners and the general public to predict future risk of having major depression. The application of the algorithm may lead to increased personalisation of treatment, better clinical decisions and more optimal mental health service planning.
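The C statistic reported above measures discrimination: the probability that a randomly chosen person who developed the outcome received a higher predicted risk than a randomly chosen person who did not. A minimal unweighted sketch follows; the study used survey-weighted data, which this illustration ignores, and the function name and toy values are hypothetical.

```python
def c_statistic(risks, outcomes):
    """Concordance between predicted risks and binary outcomes (1 = event)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0   # case correctly ranked above control
            elif case_risk == control_risk:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


# Four hypothetical participants: predicted risk and observed outcome.
toy_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values near 0.73 to 0.75 indicate that roughly three in four case-control pairs are ordered correctly.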
LP-LPA: A link influence-based label propagation algorithm for discovering community structures in networks

Science.gov (United States)

Berahmand, Kamal; Bouyer, Asgarali

2018-03-01

Community detection is an essential approach for analyzing the structural and functional properties of complex networks. Although many community detection algorithms have been recently presented, most of them are weak and limited in different ways. Label Propagation Algorithm (LPA) is a well-known and efficient community detection technique which is characterized by the merits of nearly-linear running time and easy implementation. However, LPA has some significant problems such as instability, randomness, and monster community detection. In this paper, an algorithm, namely node’s label influence policy for label propagation algorithm (LP-LPA), is proposed for detecting efficient community structures. LP-LPA computes a link strength value for each edge and a label influence value for each node, and uses them in a new label propagation strategy that prefers strong links, guides the selection of initial nodes, avoids random behavior in tie-break states, and defines an efficient update order and update rule. These procedures can resolve the randomness issue of the original LPA and stabilize the discovered communities across all runs on the same network. Experiments on synthetic networks and a wide range of real-world social networks indicated that the proposed method achieves significant accuracy and high stability. Indeed, it can clearly solve the monster community problem when detecting communities in networks.
A Novel Grey Prediction Model Combining Markov Chain with Functional-Link Net and Its Application to Foreign Tourist Forecasting

Directory of Open Access Journals (Sweden)

Yi-Chung Hu

2017-10-01

Full Text Available Grey prediction models for time series have been widely applied to demand forecasting because only limited data are required for them to build a time series model without any statistical assumptions. Previous studies have demonstrated that the combination of grey prediction with neural networks helps grey prediction perform better. Some methods have been presented to improve the prediction accuracy of the popular GM(1,1) model by using the Markov chain to estimate the residual needed to modify a predicted value. Compared to the previous Grey-Markov models, this study contributes by applying the functional-link net to estimate the degree to which a predicted value obtained from the GM(1,1) model can be adjusted. Furthermore, the troublesome number of states and their bounds that are not easily specified in Markov chain have been determined by a genetic algorithm. To verify prediction performance, the proposed grey prediction model was applied to an important grey system problem: foreign tourist forecasting. Experimental results show that the proposed model provides satisfactory results compared to the other Grey-Markov models considered.
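For readers unfamiliar with the GM(1,1) model that this record builds on, here is a minimal sketch of the plain model only. It does not include the Markov chain residual correction, the functional-link net or the genetic algorithm described in the abstract; the function names and the toy series are hypothetical.

```python
import math


def gm11_fit(series):
    """Estimate the GM(1,1) development coefficient a and grey input b."""
    # Accumulated generating operation: running sums of the raw series.
    x1, total = [], 0.0
    for value in series:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = list(series[1:])
    # Ordinary least squares for the grey equation y = -a * z + b.
    n = len(z)
    sum_z, sum_y = sum(z), sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (n * sum_zy - sum_z * sum_y) / (n * sum_zz - sum_z ** 2)
    intercept = (sum_y - slope * sum_z) / n
    return -slope, intercept


def gm11_forecast(series, steps):
    """Forecast the next values; assumes a != 0 (a non-constant series)."""
    a, b = gm11_fit(series)

    def accumulated(k):
        # Time response of the accumulated series at 0-based index k.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    n = len(series)
    # Difference the accumulated forecasts to recover raw-scale values.
    return [accumulated(k) - accumulated(k - 1) for k in range(n, n + steps)]


# Hypothetical demand series growing by about 10% per period.
history = [100.0, 110.0, 121.0, 133.1]
forecast = gm11_forecast(history, steps=2)
```

The residual between such forecasts and the observed values is what hybrid models of the kind described here attempt to estimate and correct.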
Research on wind field algorithm of wind lidar based on BP neural network and grey prediction

Science.gov (United States)

Chen, Yong; Chen, Chun-Li; Luo, Xiong; Zhang, Yan; Yang, Ze-hou; Zhou, Jie; Shi, Xiao-ding; Wang, Lei

2018-01-01

This paper uses a BP neural network and the grey algorithm to forecast and study the radar wind field. To reduce the residual error of wind field prediction based on the BP neural network and the grey algorithm, the minimum of the residual error function is calculated, the residuals of the grey algorithm are used to train the BP neural network, the trained network model is used to forecast the residual sequence, and the predicted residual sequence is used to correct the forecast sequence of the grey algorithm. The test data show that the grey algorithm corrected by the BP neural network can effectively reduce the residual value and improve the prediction precision.
Small angle X-ray scattering and cross-linking for data assisted protein structure prediction in CASP 12 with prospects for improved accuracy

KAUST Repository

Ogorzalek, Tadeusz L.; Hura, Greg L.; Belsom, Adam; Burnett, Kathryn H.; Kryshtafovych, Andriy; Tainer, John A.; Rappsilber, Juri; Tsutakawa, Susan E.; Fidelis, Krzysztof

2018-01-01

Experimental data offers empowering constraints for structure prediction. These constraints can be used to filter equivalently scored models or more powerfully within optimization functions toward prediction. In CASP12, Small Angle X-ray Scattering (SAXS) and Cross-Linking Mass Spectrometry (CLMS) data, measured on an exemplary set of novel fold targets, were provided to the CASP community of protein structure predictors. As HT, solution-based techniques, SAXS and CLMS can efficiently measure states of the full-length sequence in its native solution conformation and assembly. However, this experimental data did not substantially improve prediction accuracy judged by fits to crystallographic models. One issue, beyond intrinsic limitations of the algorithms, was a disconnect between crystal structures and solution-based measurements. Our analyses show that many targets had substantial percentages of disordered regions (up to 40%) or were multimeric or both. Thus, solution measurements of flexibility and assembly support variations that may confound prediction algorithms trained on crystallographic data and expecting globular fully-folded monomeric proteins. Here, we consider the CLMS and SAXS data collected, the information in these solution measurements, and the challenges in incorporating them into computational prediction. As improvement opportunities were only partly realized in CASP12, we provide guidance on how data from the full-length biological unit and the solution state can better aid prediction of the folded monomer or subunit. We furthermore describe strategic integrations of solution measurements with computational prediction programs with the aim of substantially improving foundational knowledge and the accuracy of computational algorithms for biologically-relevant structure predictions for proteins in solution. This article is protected by copyright. All rights reserved.
To trade or not to trade: Link prediction in the virtual water network

Science.gov (United States)

Tuninetti, Marta; Tamea, Stefania; Laio, Francesco; Ridolfi, Luca

2017-12-01

In the international trade network, links express the (temporary) presence of a commercial exchange of goods between any two countries. Given the dynamical behaviour of the trade network, where links are created and dismissed every year, predicting the link activation/deactivation is an open research question. Through the international trade network of agricultural goods, water resources are 'virtually' transferred from the country of production to the country of consumption. We propose a novel methodology for link prediction applied to the network of virtual water trade. Starting from the assumption of having links between any two countries, we estimate the associated virtual water flows by means of a gravity-law model using country and link characteristics as drivers. We consider the links with estimated flows higher than 1000 m³/year as active links, and the others as non-active links. Flows traded along estimated active links are then re-estimated using a similar but differently-calibrated gravity-law model. We were able to correctly model 84% of the existing links and 93% of the non-existing links in year 2011. It is worth noting that the predicted active links carry 99% of the global virtual water flow; hence, missed links are mainly those where a minimum volume of virtual water is exchanged. Results indicate that, over the period from 1986 to 2011, population, geographical distances between countries, and agricultural efficiency (through fertilizer use) are the major factors driving the link activation and deactivation. As opposed to other (network-based) models for link prediction, the proposed method is able to reconstruct the network architecture without any prior knowledge of the network topology, using only the node and link attributes; it thus represents a general method that can be applied to other networks such as food or value trade networks.
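The thresholding step described above can be illustrated with a generic gravity-law sketch. Only the 1000 m³/year cut-off comes from the abstract. The functional form (two country "masses" divided by a power of distance) and every parameter value are assumptions for illustration, not the authors' calibrated model, which also uses other country and link characteristics.

```python
# Threshold stated in the abstract: flows above it mark a link as active.
ACTIVE_THRESHOLD_M3_PER_YEAR = 1000.0


def gravity_flow(mass_origin, mass_destination, distance_km,
                 g=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Generic gravity law: g * M_i**alpha * M_j**beta / d**gamma.

    All exponents and the constant g are placeholder values (assumed).
    """
    return (g * mass_origin ** alpha * mass_destination ** beta
            / distance_km ** gamma)


def is_active_link(estimated_flow_m3_per_year):
    """Classify a country pair as an active link if the flow is large enough."""
    return estimated_flow_m3_per_year > ACTIVE_THRESHOLD_M3_PER_YEAR


# Two hypothetical country pairs with made-up driver values.
strong_flow = gravity_flow(2000.0, 500.0, 10.0)
weak_flow = gravity_flow(100.0, 100.0, 100.0)
```

Because the classification depends only on node and link attributes, a sketch like this needs no knowledge of which links already exist, which is the property the abstract emphasises.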
Comparison of the accuracy of three algorithms in predicting accessory pathways among adult Wolff-Parkinson-White syndrome patients.

Science.gov (United States)

Maden, Orhan; Balci, Kevser Gülcihan; Selcuk, Mehmet Timur; Balci, Mustafa Mücahit; Açar, Burak; Unal, Sefa; Kara, Meryem; Selcuk, Hatice

2015-12-01

The aim of this study was to investigate the accuracy of three algorithms in predicting accessory pathway locations in adult patients with Wolff-Parkinson-White syndrome in Turkish population. A total of 207 adult patients with Wolff-Parkinson-White syndrome were retrospectively analyzed. The most preexcited 12-lead electrocardiogram in sinus rhythm was used for analysis. Two investigators blinded to the patient data used three algorithms for prediction of accessory pathway location. Among all locations, 48.5% were left-sided, 44% were right-sided, and 7.5% were located in the midseptum or anteroseptum. When only exact locations were accepted as match, predictive accuracy for Chiang was 71.5%, 72.4% for d'Avila, and 71.5% for Arruda. The percentage of predictive accuracy of all algorithms did not differ between the algorithms (p = 1.000; p = 0.875; p = 0.885, respectively). The best algorithm for prediction of right-sided, left-sided, and anteroseptal and midseptal accessory pathways was Arruda (p algorithms were similar in predicting accessory pathway location and the predicted accuracy was lower than previously reported by their authors. However, according to the accessory pathway site, the algorithm designed by Arruda et al. showed better predictions than the other algorithms and using this algorithm may provide advantages before a planned ablation.
Algorithms for assessing person-based consistency among linked records for the investigation of maternal use of medications and safety

Directory of Open Access Journals (Sweden)

Duong Tran

2017-04-01

Quality assessment indicated high consistency among linked records. The set of algorithms developed in this project can be applied to similar linked perinatal datasets to promote a consistent approach and comparability across studies.
The wind power prediction research based on mind evolutionary algorithm

Science.gov (United States)

Zhuang, Ling; Zhao, Xinjian; Ji, Tianming; Miao, Jingwen; Cui, Haina

2018-04-01

When wind power is connected to the power grid, its fluctuation, intermittency, and randomness affect the stability of the power system. Wind power prediction can help guarantee power quality and reduce the operating cost of the power system. Several traditional wind power prediction methods have limitations. On this basis, a wind power prediction method based on the Mind Evolutionary Algorithm (MEA) is put forward and a prediction model is provided. The experimental results demonstrate that MEA performs efficiently in terms of wind power prediction. The MEA method has broad prospects for engineering application.
RNA secondary structure prediction with pseudoknots: Contribution of algorithm versus energy model.

Science.gov (United States)

Jabbari, Hosna; Wark, Ian; Montemagno, Carlo

2018-01-01

RNA is a biopolymer with various applications inside the cell and in biotechnology. The structure of an RNA molecule largely determines its function and is essential to guide nanostructure design. Since experimental structure determination is time-consuming and expensive, accurate computational prediction of RNA structure is of great importance. Prediction of RNA secondary structure is simpler than prediction of its tertiary structure and provides information about the tertiary structure; therefore, RNA secondary structure prediction has received attention in the past decades. Numerous methods with different folding approaches have been developed for RNA secondary structure prediction. While methods for prediction of RNA pseudoknot-free structure (structures with no crossing base pairs) have greatly improved in terms of their accuracy, methods for prediction of RNA pseudoknotted secondary structure (structures with crossing base pairs) still have room for improvement. A long-standing question for improving the prediction accuracy of RNA pseudoknotted secondary structure is whether to focus on the prediction algorithm or the underlying energy model, as there is a trade-off between the computational cost of the prediction algorithm and the generality of the method. The aim of this work is to argue that, when comparing different methods for RNA pseudoknotted structure prediction, the combination of algorithm and energy model should be considered, and a method should not be considered superior or inferior to others if they do not use the same scoring model. We demonstrate that while the folding approach is important in structure prediction, it is not the only important factor in the prediction accuracy of a given method, as the underlying energy model is also of great value. Therefore we encourage researchers to pay particular attention when comparing methods with different energy models.
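The distinction between pseudoknot-free and pseudoknotted structures rests on whether any two base pairs cross. A minimal sketch of that check, assuming base pairs are given as index tuples and each position pairs at most once (the function name is illustrative):

```python
def has_pseudoknot(base_pairs):
    """Return True if two base pairs (i, j) and (k, l) cross, i.e. i < k < j < l."""
    pairs = sorted((min(p), max(p)) for p in base_pairs)
    for idx, (i, j) in enumerate(pairs):
        for k, l in pairs[idx + 1:]:
            if i < k < j < l:
                return True
    return False


nested = [(0, 9), (1, 8), (2, 7)]    # a simple stem: no crossing pairs
crossing = [(0, 5), (1, 4), (3, 8)]  # (0, 5) and (3, 8) cross
```

The pairwise scan is quadratic in the number of base pairs, which is adequate for illustration.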
Non-linear multivariable predictive control of an alcoholic fermentation process using functional link networks

Directory of Open Access Journals (Sweden)

Luiz Augusto da Cruz Meleiro

2005-06-01

In this work a MIMO non-linear predictive controller was developed for an extractive alcoholic fermentation process. The internal model of the controller was represented by two MISO Functional Link Networks (FLNs), identified using simulated data generated from a deterministic mathematical model whose kinetic parameters were determined experimentally. The FLN structure offers fast training and guaranteed convergence, since the estimation of the weights is a linear optimization problem. In addition, the elimination of non-significant weights generates parsimonious models, which allows for fast execution in an MPC-based algorithm. The proposed algorithm showed good potential in identification and control of non-linear processes.

The Inverse Contagion Problem (ICP) vs. Predicting site contagion in real time, when network links are not observable

Science.gov (United States)

Mushkin, I.; Solomon, S.

2017-10-01

We study the inverse contagion problem (ICP). As opposed to the direct contagion problem, in which the network structure is known and the question is when each node will be contaminated, in the inverse problem the links of the network are unknown but a sequence of contagion histories (the times when each node was contaminated) is observed. We consider two versions of the ICP: The strong problem (SICP), which is the reconstruction of the network and has been studied before, and the weak problem (WICP), which requires "only" the prediction (at each time step) of the nodes that will be contaminated at the next time step (this is often the real life situation in which a contagion is observed and predictions are made in real time). Moreover, our focus is on analyzing the increasing accuracy of the solution, as a function of the number of contagion histories already observed. For simplicity, we discuss the simplest (deterministic and synchronous) contagion dynamics and the simplest solution algorithm, which we have applied to different network types. The main result of this paper is that the complex problem of the convergence of the ICP for a network can be reduced to an individual property of pairs of nodes: the "false link difficulty". By definition, given a pair of unlinked nodes i and j, the difficulty of the false link (i,j) is the probability that in a random contagion history, the nodes i and j are not contaminated at the same time step (or at consecutive time steps). In other words, the "false link difficulty" of a non-existing network link is the probability that the observations during a random contagion history would not rule out that link. This probability is relatively straightforward to calculate, and in most instances relies only on the relative positions of the two nodes (i,j) and not on the entire network structure. We have observed the distribution of false link difficulty for various network types, estimated it theoretically and confronted it
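A minimal sketch of the "false link difficulty", following the restatement above as the probability that a random contagion history would not rule out a non-existing link. It assumes the simplest deterministic, synchronous dynamics (each contaminated node contaminates all its neighbours at the next step), a single seed chosen uniformly at random, and a connected network. Under these assumptions a real link forces the two contamination times to differ by at most one step, so a history is compatible with the link exactly when that holds. The function names and the toy graph are hypothetical.

```python
from collections import deque


def contagion_times(adjacency, seed):
    """Contamination times under deterministic synchronous spreading (BFS distance from the seed)."""
    times = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in times:
                times[neighbor] = times[node] + 1
                queue.append(neighbor)
    return times


def false_link_difficulty(adjacency, i, j):
    """Fraction of single-seed histories that would NOT rule out a link (i, j)."""
    compatible = 0
    for seed in adjacency:
        t = contagion_times(adjacency, seed)
        if abs(t[i] - t[j]) <= 1:
            compatible += 1
    return compatible / len(adjacency)


# Path graph 0 - 1 - 2 - 3 - 4 (hypothetical toy network).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
difficulty_0_2 = false_link_difficulty(path, 0, 2)
```

On this path graph only the history seeded at node 1 reaches nodes 0 and 2 at times that a real link would allow, so one history in five fails to rule the link out.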
Predicting disease risk using bootstrap ranking and classification algorithms.

Directory of Open Access Journals (Sweden)

Ohad Manor

Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a "black box" in order to promote changes in life-style and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping in order to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment.
Predicting mining activity with parallel genetic algorithms

Science.gov (United States)

Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.

2005-01-01

We explore several different techniques in our quest to improve the overall model performance of a genetic algorithm calibrated probabilistic cellular automata. We use the Kappa statistic to measure correlation between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.
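The Kappa statistic mentioned above compares observed agreement between two labelings with the agreement expected by chance. A minimal sketch of Cohen's kappa for categorical cell labels follows; the example arrays are hypothetical and do not come from the study.

```python
from collections import Counter


def cohens_kappa(truth, predicted):
    """Cohen's kappa: agreement between two labelings, corrected for chance agreement."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    truth_counts = Counter(truth)
    pred_counts = Counter(predicted)
    expected = sum(truth_counts[c] * pred_counts[c] for c in truth_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings are constant and identical
    return (observed - expected) / (1.0 - expected)


# Hypothetical cells: 1 = mining activity, 0 = none.
ground_truth = [1, 1, 0, 0, 1, 0, 0, 0]
model_output = [1, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(ground_truth, model_output)
```

Kappa is 1 for perfect agreement and about 0 when the agreement is no better than chance, which makes it more informative than raw accuracy when one class dominates the map.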
Biomine: predicting links between biological entities using network models of heterogeneous databases

Directory of Open Access Journals (Sweden)

Eronen Lauri

2012-06-01

Background: Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Results: Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. Conclusions: The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable
Diagnosis and prediction of periodontally compromised teeth using a deep learning-based convolutional neural network algorithm.

Science.gov (United States)

Lee, Jae-Hong; Kim, Do-Hyung; Jeong, Seong-Nyum; Choi, Seong-Ho

2018-04-01

The aim of the current study was to develop a computer-assisted detection system based on a deep convolutional neural network (CNN) algorithm and to evaluate the potential usefulness and accuracy of this system for the diagnosis and prediction of periodontally compromised teeth (PCT). Combining pretrained deep CNN architecture and a self-trained network, periapical radiographic images were used to determine the optimal CNN algorithm and weights. The diagnostic and predictive accuracy, sensitivity, specificity, positive predictive value, negative predictive value, receiver operating characteristic (ROC) curve, area under the ROC curve, confusion matrix, and 95% confidence intervals (CIs) were calculated using our deep CNN algorithm, based on a Keras framework in Python. The periapical radiographic dataset was split into training (n=1,044), validation (n=348), and test (n=348) datasets. With the deep learning algorithm, the diagnostic accuracy for PCT was 81.0% for premolars and 76.7% for molars. Using 64 premolars and 64 molars that were clinically diagnosed as severe PCT, the accuracy of predicting extraction was 82.8% (95% CI, 70.1%-91.2%) for premolars and 73.4% (95% CI, 59.9%-84.0%) for molars. We demonstrated that the deep CNN algorithm was useful for assessing the diagnosis and predictability of PCT. Therefore, with further optimization of the PCT dataset and improvements in the algorithm, a computer-aided detection system can be expected to become an effective and efficient method of diagnosing and predicting PCT.
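The diagnostic measures listed above all derive from the four cells of a binary confusion matrix. A minimal sketch with hypothetical counts (not the study's data); the function name is illustrative:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for a 100-image test set.
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
```

Sensitivity and specificity are properties of the classifier, whereas PPV and NPV also depend on how common the condition is in the test set.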
A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

Science.gov (United States)

Manning, Robert M.

1990-01-01

A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of the optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics database is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0.5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.
An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

Science.gov (United States)

Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

2017-12-01

Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
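The key idea above, choosing initial centroids that lie in dense regions and are well separated, can be sketched deterministically as follows. This is an illustrative stand-in, not the authors' procedure: the neighbour-count density estimate and the `radius` and `min_separation` parameters are assumptions.

```python
import math


def select_initial_centroids(points, k, radius, min_separation):
    """Pick up to k dense, mutually separated points as initial K-Means centroids."""
    # Density of a point = number of points (itself included) within `radius`.
    density = [
        sum(1 for q in points if math.dist(p, q) <= radius)
        for p in points
    ]
    # Decreasing density; ties broken by original index, so the order is deterministic.
    order = sorted(range(len(points)), key=lambda idx: (-density[idx], idx))
    centroids = []
    for idx in order:
        candidate = points[idx]
        if all(math.dist(candidate, c) >= min_separation for c in centroids):
            centroids.append(candidate)
        if len(centroids) == k:
            break
    return centroids


# Two tight groups plus one isolated point (hypothetical data).
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
initial = select_initial_centroids(data, k=2, radius=1.5, min_separation=5.0)
```

Because no random numbers are involved, repeated runs on the same dataset start K-Means from the same centroids and therefore give the same clustering.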
CrossLink: a novel method for cross-condition classification of cancer subtypes.

Science.gov (United States)

Ma, Chifeng; Sastry, Konduru S; Flore, Mario; Gehani, Salah; Al-Bozom, Issam; Feng, Yusheng; Serpedin, Erchin; Chouchane, Lotfi; Chen, Yidong; Huang, Yufei

2016-08-22

We considered the prediction of cancer classes (e.g. subtypes) using patient gene expression profiles that contain both systematic and condition-specific biases when compared with the training reference dataset. The conventional normalization-based approaches cannot guarantee that the gene signatures in the reference and prediction datasets always have the same distribution for all different conditions, as the class-specific gene signatures change with the condition. Therefore, the trained classifier would work well under one condition but not under another. To address the problem of current normalization approaches, we propose a novel algorithm called CrossLink (CL). CL recognizes that there is no universal, condition-independent normalization mapping of signatures. Instead, it exploits the fact that the signature is unique to its associated class under any condition and thus employs an unsupervised clustering algorithm to discover this unique signature. We assessed the performance of CL for cross-condition predictions of PAM50 subtypes of breast cancer by using a simulated dataset modeled after TCGA BRCA tumor samples with a cross-validation scheme, and datasets with known and unknown PAM50 classification. CL achieved prediction accuracy >73%, the highest among the methods we evaluated. We also applied the algorithm to a set of breast cancer tumors derived from an Arabic population to assign a PAM50 classification to each tumor based on its gene expression profile. A novel algorithm, CrossLink, for cross-condition prediction of cancer classes was proposed. In all test datasets, CL showed robust and consistent improvement in prediction performance over other state-of-the-art normalization and classification algorithms.
A comprehensive performance evaluation on the prediction results of existing cooperative transcription factors identification algorithms.

Science.gov (United States)

Lai, Fu-Jou; Chang, Hong-Tsun; Huang, Yueh-Min; Wu, Wei-Sheng

2014-01-01

Eukaryotic transcriptional regulation is known to be highly connected through the networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. Recent advances in computational techniques have led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on a different rationale, each had its own merits and claimed to outperform the others. However, such claims were prone to subjectivity because each algorithm was compared with only a few other algorithms, using only a small set of performance indices. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms and, based on the proposed performance indices, to conduct a comprehensive performance evaluation. We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured, and for each performance index a ranking score based on the mean cooperativity of the set was given to each set of PCTFPs under evaluation. The ranking scores of a set of PCTFPs vary with the performance index, implying that an algorithm for predicting cooperative TF pairs may be strong in some respects but weak in others. We finally made a comprehensive ranking of these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation on the prediction results of 14 existing cooperative TFs identification algorithms. Most importantly, these proposed indices can be easily applied to measure the performance of new
Using linked electronic data to validate algorithms for health outcomes in administrative databases.

Science.gov (United States)

Lee, Wan-Ju; Lee, Todd A; Pickard, Alan Simon; Shoaibi, Azadeh; Schumock, Glen T

2015-08-01

The validity of algorithms used to identify health outcomes in claims-based and administrative data is critical to the reliability of findings from observational studies. The traditional approach to algorithm validation, using medical charts, is expensive and time-consuming. An alternative method is to link the claims data to an external, electronic data source that contains information allowing confirmation of the event of interest. In this paper, we describe this external linkage validation method and delineate important considerations to assess the feasibility and appropriateness of validating health outcomes using this approach. This framework can help investigators decide whether to pursue an external linkage validation method for identifying health outcomes in administrative/claims data.
Forecasting pulsatory motion for non-invasive cardiac radiosurgery: an analysis of algorithms from respiratory motion prediction.

Science.gov (United States)

Ernst, Floris; Bruder, Ralf; Schlaefer, Alexander; Schweikard, Achim

2011-01-01

Recently, radiosurgical treatment of cardiac arrhythmia, especially atrial fibrillation, has been proposed. Using the CyberKnife, focussed radiation will be used to create ablation lines on the beating heart to block unwanted electrical activity. Since this procedure requires high accuracy, the inevitable latency of the system (i.e., the robotic manipulator following the motion of the heart) has to be compensated for. We examine the applicability of prediction algorithms developed for respiratory motion prediction to the prediction of pulsatory motion. We evaluated the MULIN, nLMS, wLMS, SVRpred and EKF algorithms. The test data were recorded using external infrared position sensors, 3D ultrasound and the NavX catheter systems. With these data, we have shown that the error from latency can be reduced by at least 10% and as much as 75% (44% on average), depending on the type of signal. It has also been shown that, although the SVRpred algorithm was successful in most cases, it was outperformed by the simple nLMS algorithm, the EKF or the wLMS algorithm in a number of cases. We have shown that prediction of cardiac motion is possible and that the algorithms known from respiratory motion prediction are applicable. Since pulsation is more regular than respiration, more research will have to be done to improve frequency-tracking algorithms, like the EKF method, which performed better than expected from their behaviour on respiratory motion traces.
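Of the algorithms named above, the normalized least mean squares (nLMS) predictor is the simplest to sketch. The version below is a generic textbook nLMS used for latency compensation, not the implementation evaluated in the study; the filter order, step size `mu`, horizon, and the synthetic sinusoid are assumptions made for illustration.

```python
import math


def nlms_predict(signal, horizon, order=4, mu=0.5, eps=1e-8):
    """Predict signal[t + horizon] from the `order` most recent samples up to time t."""
    weights = [0.0] * order
    predictions = [None] * len(signal)  # predictions[t + horizon] is produced at time t
    for t in range(order - 1, len(signal)):
        # Adapt: the newest sample is the target for the window seen `horizon` steps ago.
        past = t - horizon
        if past >= order - 1:
            x_old = signal[past - order + 1: past + 1][::-1]
            error = signal[t] - sum(w * x for w, x in zip(weights, x_old))
            norm = eps + sum(x * x for x in x_old)
            weights = [w + mu * error * x / norm for w, x in zip(weights, x_old)]
        # Predict `horizon` steps ahead from the current window.
        if t + horizon < len(signal):
            x_now = signal[t - order + 1: t + 1][::-1]
            predictions[t + horizon] = sum(w * x for w, x in zip(weights, x_now))
    return predictions


# Synthetic periodic "motion" trace: 50 samples per cycle, latency of 5 samples.
horizon = 5
signal = [math.sin(2 * math.pi * t / 50) for t in range(2000)]
predictions = nlms_predict(signal, horizon)
```

The relevant comparison is against doing nothing, i.e. using the sample from `horizon` steps earlier as the current position; the reduction in that latency error is the kind of figure the abstract reports.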
Crius: A Novel Fragment-Based Algorithm of De Novo Substrate Prediction for Enzymes.

Science.gov (United States)

Yao, Zhiqiang; Jiang, Shuiqin; Zhang, Lujia; Gao, Bei; He, Xiao; Zhang, John Z H; Wei, Dongzhi

2018-05-03

The study of enzyme substrate specificity is vital for developing potential applications of enzymes. However, routine experimental procedures require a lot of resources to discover novel substrates. This article reports an in silico structure-based algorithm called Crius, which predicts substrates for enzymes. The results of this fragment-based algorithm show good agreement between the simulated and experimental substrate specificities, using a lipase from Candida antarctica (CALB), a nitrilase from the cyanobacterium Synechocystis sp. PCC6803 (Nit6803), and an aldo-keto reductase from Gluconobacter oxydans (Gox0644). This opens new prospects for developing computer algorithms that can effectively predict substrates for an enzyme. This article is protected by copyright. All rights reserved. © 2018 The Protein Society.
An Interference-Aware Traffic-Priority-Based Link Scheduling Algorithm for Interference Mitigation in Multiple Wireless Body Area Networks

Directory of Open Access Journals (Sweden)

Thien T. T. Le

2016-12-01

Currently, wireless body area networks (WBANs) are effectively used for health monitoring services. However, in cases where WBANs are densely deployed, interference among WBANs can cause serious degradation of network performance and reliability. Inter-WBAN interference can be reduced by scheduling the communication links of interfering WBANs. In this paper, we propose an interference-aware traffic-priority-based link scheduling (ITLS) algorithm to overcome inter-WBAN interference in densely deployed WBANs. First, we model a network with multiple WBANs as an interference graph where node-level interference and traffic priority are taken into account. Second, we formulate link scheduling for multiple WBANs as an optimization model where the objective is to maximize the throughput of the entire network while ensuring the traffic priority of sensor nodes. Finally, we propose the ITLS algorithm for multiple WBANs on the basis of the optimization model. The proposed ITLS also achieves high spatial reuse while considering traffic priority, packet length, and the number of interfered sensor nodes. Our simulation results show that the proposed ITLS significantly increases spatial reuse and network throughput with lower delay by mitigating inter-WBAN interference.
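Scheduling on an interference graph can be illustrated with a greedy heuristic: links are visited in decreasing traffic priority and each takes the earliest time slot not already used by a link it interferes with. This is a generic sketch, not the ITLS algorithm or its optimization model; the link names, priorities, and interference sets are hypothetical.

```python
def greedy_priority_schedule(priorities, interference):
    """Assign each link the earliest slot not used by any interfering link.

    priorities: dict mapping link -> priority (higher is scheduled first)
    interference: dict mapping link -> set of links it interferes with (symmetric)
    """
    slots = {}
    # Highest priority first; ties broken by link name to keep the result deterministic.
    for link in sorted(priorities, key=lambda name: (-priorities[name], name)):
        used = {slots[other] for other in interference.get(link, ()) if other in slots}
        slot = 0
        while slot in used:
            slot += 1
        slots[link] = slot
    return slots


# Four sensor links in a chain of interference: a-b, b-c, c-d (hypothetical).
link_priorities = {"a": 3, "b": 2, "c": 2, "d": 1}
link_interference = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
schedule = greedy_priority_schedule(link_priorities, link_interference)
```

Links that do not interfere, such as `a` and `c`, end up sharing a slot, which is the spatial reuse the abstract refers to.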
Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

Science.gov (United States)

Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

2015-01-01

Background Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (pmachine learning algorithm (p=0.047). Conclusion Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273
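The c-statistic used above to assess discrimination equals the area under the ROC curve: the probability that a randomly chosen patient who developed the outcome received a higher risk score than a randomly chosen patient who did not. A minimal sketch with hypothetical scores and outcomes (not study data):

```python
def c_statistic(scores, outcomes):
    """Probability that a random case outscores a random non-case (ties count as 1/2)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes (1 = developed HCC).
risk_scores = [0.9, 0.8, 0.4, 0.35, 0.2]
developed_hcc = [1, 0, 1, 0, 0]
c_stat = c_statistic(risk_scores, developed_hcc)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which puts the reported values of 0.61 and 0.64 in context.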
Investigation of energy management strategies for photovoltaic systems - A predictive control algorithm

Science.gov (United States)

Cull, R. C.; Eltimsahy, A. H.

1983-01-01

The present investigation is concerned with the formulation of energy management strategies for stand-alone photovoltaic (PV) systems, taking into account a basic control algorithm for a possible predictive (and adaptive) controller. The control system controls the flow of energy in the system according to the amount of energy available, and predicts the appropriate control set-points based on the energy (insolation) available by using an appropriate system model. Aspects of adaptation to the conditions of the system are also considered. Attention is given to a statistical analysis technique, the analysis inputs, the analysis procedure, and details regarding the basic control algorithm.
Accuracy of algorithms to predict accessory pathway location in children with Wolff-Parkinson-White syndrome.

Science.gov (United States)

Wren, Christopher; Vogel, Melanie; Lord, Stephen; Abrams, Dominic; Bourke, John; Rees, Philip; Rosenthal, Eric

2012-02-01

The aim of this study was to examine the accuracy in predicting pathway location in children with Wolff-Parkinson-White syndrome for each of seven published algorithms. ECGs from 100 consecutive children with Wolff-Parkinson-White syndrome undergoing electrophysiological study were analysed by six investigators using seven published algorithms, six of which had been developed in adult patients. Accuracy and concordance of predictions were adjusted for the number of pathway locations. Accessory pathways were left-sided in 49, septal in 20 and right-sided in 31 children. Overall accuracy of prediction was 30-49% for the exact location and 61-68% including adjacent locations. Concordance between investigators varied between 41% and 86%. No algorithm was better at predicting septal pathways (accuracy 5-35%, improving to 40-78% including adjacent locations), but one was significantly worse. Predictive accuracy was 24-53% for the exact location of right-sided pathways (50-71% including adjacent locations) and 32-55% for the exact location of left-sided pathways (58-73% including adjacent locations). All algorithms were less accurate in our hands than in other authors' own assessment. None performed well in identifying midseptal or right anteroseptal accessory pathway locations.
SU-C-BRF-07: A Pattern Fusion Algorithm for Multi-Step Ahead Prediction of Surrogate Motion

International Nuclear Information System (INIS)

Zawisza, I; Yan, H; Yin, F

2014-01-01

Purpose: To ensure that tumor motion stays within the radiation field during high-dose, high-precision radiosurgery, real-time imaging and surrogate monitoring are employed. These methods provide real-time tumor/surrogate motion, but no future information is available. In order to anticipate future tumor/surrogate motion and track the target location precisely, an algorithm is developed and investigated for estimating surrogate motion multiple steps ahead. Methods: The study utilized a one-dimensional surrogate motion signal divided into three components: (a) a training component containing the primary data, from the first frame to the beginning of the input subsequence; (b) an input subsequence component of the surrogate signal, used as input to the prediction algorithm; (c) an output subsequence component, the remaining signal, used as the known output of the prediction algorithm for validation. The prediction algorithm consists of three major steps: (1) extracting the subsequences from the training component that best match the input subsequence according to a given criterion; (2) calculating weighting factors from these best-matched subsequences; (3) collecting the parts of the signal that follow these subsequences and combining them with the assigned weighting factors to form the output. The prediction algorithm was examined for several patients, and its performance was assessed based on the correlation between the prediction and the known output. Results: Respiratory motion data were collected for 20 patients using the RPM system. The output subsequence was the last 50 samples (∼2 seconds) of a surrogate signal, and the input subsequence was the 100 frames (∼3 seconds) prior to the output subsequence. Based on the analysis of the correlation coefficient between the predicted and known output subsequences, the average correlation is 0.9644±0.0394 and 0.9789±0.0239 for the equal-weighting and relative-weighting strategies, respectively. Conclusion: Preliminary results indicate that the prediction
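The three steps listed in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the matching criterion (Pearson correlation), the number of matches `k`, the way relative weights are derived from the match scores, and all function names are assumptions made for illustration.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def pattern_fusion_predict(training, query, horizon, k=3, relative=True):
    """Predict `horizon` samples following `query` from similar training windows."""
    m = len(query)
    # Step 1: score every training window that still has `horizon` samples after it
    # (assumed criterion: correlation with the input subsequence).
    scored = [
        (correlation(training[s:s + m], query), s)
        for s in range(len(training) - m - horizon + 1)
    ]
    if not scored:
        raise ValueError("training signal is too short for this query and horizon")
    best = sorted(scored, reverse=True)[:k]
    # Step 2: weighting factors, either equal or proportional to the match score
    # (the proportional rule is an assumption, not taken from the abstract).
    if relative:
        weights = [max(score, 0.0) for score, _ in best]
    else:
        weights = [1.0] * len(best)
    if sum(weights) == 0:
        weights = [1.0] * len(best)
    total = sum(weights)
    # Step 3: combine the samples that follow each matched window.
    return [
        sum(w * training[s + m + step] for w, (_, s) in zip(weights, best)) / total
        for step in range(horizon)
    ]
```

For a strictly periodic signal the matched windows share the phase of the input subsequence, so the fused continuation reproduces the true future samples; real respiratory traces are only quasi-periodic, which is where the choice of weighting strategy starts to matter.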
Dynamically Predicting the Quality of Service: Batch, Online, and Hybrid Algorithms

Directory of Open Access Journals (Sweden)

Ya Chen

2017-01-01

Full Text Available This paper studies the problem of dynamically modeling the quality of web services and presents a design philosophy for practical web service recommender systems. A general system architecture for such systems continuously collects user-service invocation records and includes both an online training module and an offline training module for quality prediction. In addition, we introduce matrix factorization-based online and offline training algorithms based on gradient descent and demonstrate how well this online/offline algorithm framework fits the proposed architecture. The superiority of the proposed model is confirmed by empirical studies on a real-life quality of web service data set and by comparisons with existing web service recommendation algorithms.
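As a rough illustration of how one gradient-descent matrix factorization can serve both an online and an offline training module, the Python sketch below applies a single stochastic update per incoming invocation record and reuses that update for repeated passes over stored records. The regularized squared-error loss, the hyperparameters, and every function name are assumptions, not details taken from the paper.

```python
import random


def init_factors(n_users, n_services, rank, seed=0):
    """Small random latent factors for users (P) and services (Q)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_services)]
    return P, Q


def predict(P, Q, user, service):
    """Predicted quality value: inner product of the two latent vectors."""
    return sum(p * q for p, q in zip(P[user], Q[service]))


def online_update(P, Q, user, service, observed, lr=0.05, reg=0.01):
    """One stochastic gradient step for a single invocation record."""
    err = observed - predict(P, Q, user, service)
    p_old, q_old = list(P[user]), list(Q[service])
    # Both vectors are updated from the old values so the step is symmetric.
    P[user] = [p + lr * (err * q - reg * p) for p, q in zip(p_old, q_old)]
    Q[service] = [q + lr * (err * p - reg * q) for p, q in zip(p_old, q_old)]
    return err


def offline_train(P, Q, records, epochs=200, lr=0.05, reg=0.01):
    """Batch mode: sweep repeatedly over all stored (user, service, value) records."""
    for _ in range(epochs):
        for user, service, observed in records:
            online_update(P, Q, user, service, observed, lr, reg)


def mse(P, Q, records):
    """Mean squared prediction error over a set of records."""
    return sum((obs - predict(P, Q, u, s)) ** 2 for u, s, obs in records) / len(records)
```

In this arrangement the offline module can periodically refit the factors on the full history, while the online module keeps them current between refits by calling `online_update` on each new record.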
Influenza detection and prediction algorithms: comparative accuracy trial in Östergötland county, Sweden, 2008-2012.

Science.gov (United States)

Spreco, A; Eriksson, O; Dahlström, Ö; Timpka, T

2017-07-01

Methods for the detection of influenza epidemics and prediction of their progress have seldom been comparatively evaluated using prospective designs. This study aimed to perform a prospective comparative trial of algorithms for the detection and prediction of increased local influenza activity. Data on clinical influenza diagnoses recorded by physicians and syndromic data from a telenursing service were used. Five detection and three prediction algorithms previously evaluated in public health settings were calibrated and then evaluated over 3 years. When applied to diagnostic data, only detection using the Serfling regression method and prediction using the non-adaptive log-linear regression method showed acceptable performance during winter influenza seasons. For the syndromic data, none of the detection algorithms displayed satisfactory performance, while non-adaptive log-linear regression was the best-performing prediction method. We conclude that evidence was found that available algorithms for influenza detection and prediction display satisfactory performance when applied to local diagnostic data during winter influenza seasons. When applied to local syndromic data, the evaluated algorithms did not display consistent performance. Further evaluations and research on combinations of methods of these types in public health information infrastructures for 'nowcasting' (integrated detection and prediction) of influenza activity are warranted.
A computational environment for long-term multi-feature and multi-algorithm seizure prediction.

Science.gov (United States)

Teixeira, C A; Direito, B; Costa, R P; Valderrama, M; Feldwisch-Drentrup, H; Nikolopoulos, S; Le Van Quyen, M; Schelter, B; Dourado, A

2010-01-01

The daily life of epilepsy patients is constrained by the possibility of seizures occurring. Until now, seizures could not be predicted with sufficient sensitivity and specificity. Most seizure prediction studies have focused on a small number of patients and frequently assume unrealistic hypotheses. This paper adopts the view that, for an appropriate development of reliable predictors, one should consider long-term recordings and several features and algorithms integrated in one software tool. A computational environment based on MATLAB® is presented, aiming to be an innovative tool for seizure prediction. It results from the need for a powerful and flexible tool for long-term EEG/ECG analysis by multiple features and algorithms. After being extracted, features can be subjected to several reduction and selection methods and then used for prediction. The predictions can be based on optimized thresholds or on computational intelligence methods. One important aspect is the integrated evaluation of the seizure prediction characteristic of the developed predictors.

SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

Science.gov (United States)

Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

2015-01-01

Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
Analysis of longitudinal variations in North Pacific alkalinity to improve predictive algorithms

Science.gov (United States)

Fry, Claudia H.; Tyrrell, Toby; Achterberg, Eric P.

2016-10-01

The causes of natural variation in alkalinity in the North Pacific surface ocean need to be investigated to understand the carbon cycle and to improve predictive algorithms. We used GLODAPv2 to test hypotheses on the causes of three longitudinal phenomena in Alk*, a tracer of calcium carbonate cycling. These phenomena are (a) an increase from east to west between 45°N and 55°N, (b) an increase from west to east between 25°N and 40°N, and (c) a minor increase from west to east in the equatorial upwelling region. Between 45°N and 55°N, Alk* is higher on the western than on the eastern side, and this is associated with denser isopycnals with higher Alk* lying at shallower depths. Between 25°N and 40°N, upwelling along the North American continental shelf causes higher Alk* in the east. Along the equator, a strong east-west trend was not observed, even though the upwelling on the eastern side of the basin is more intense, because the water brought to the surface is not high in Alk*. We created two algorithms to predict alkalinity, one for the entire Pacific Ocean north of 30°S and one for the eastern margin. The Pacific Ocean algorithm is more accurate than the commonly used algorithm published by Lee et al. (2006), of similar accuracy to the best previously published algorithm by Sasse et al. (2013), and is less biased with longitude than other algorithms in the subpolar North Pacific. Our eastern margin algorithm is more accurate than previously published algorithms.
An improved simplified model predictive control algorithm and its application to a continuous fermenter

Directory of Open Access Journals (Sweden)

W. H. Kwong

2000-06-01

Full Text Available This work proposes a new simplified model predictive control algorithm. The algorithm is developed within the framework of internal model control, and it is easy to understand and implement. Simulation results for a continuous fermenter are presented, showing that the proposed control algorithm is robust to moderate variations in plant parameters. The algorithm shows good performance for setpoint tracking.
A time series based sequence prediction algorithm to detect activities of daily living in smart home.

Science.gov (United States)

Marufuzzaman, M; Reaz, M B I; Ali, M A M; Rahman, L F

2015-01-01

The goal of smart homes is to create an intelligent environment that adapts to the inhabitants' needs and assists people who need special care and safety in their daily lives. This can be achieved by collecting ADL (activities of daily living) data and analyzing it within existing computing elements. In this research, a recent algorithm named sequence prediction via enhanced episode discovery (SPEED) is modified to include a time component in order to improve accuracy. The modified SPEED, or M-SPEED, is a sequence prediction algorithm that extends the previous SPEED algorithm by using the time duration of appliances' ON-OFF states to decide the next state. M-SPEED discovered periodic episodes of inhabitant behavior, was trained with the learned episodes, and made decisions based on the obtained knowledge. The results showed that M-SPEED achieves 96.8% prediction accuracy, which is better than other time prediction algorithms such as PUBS, ALZ with temporal rules, and the previous SPEED. Since human behavior shows natural temporal patterns, duration times can be used to predict future events more accurately. This inhabitant activity prediction system will certainly improve smart homes by ensuring safety and better care for elderly and handicapped people.
Nonlinear model predictive control theory and algorithms

CERN Document Server

Grüne, Lars

2017-01-01

This book offers readers a thorough and rigorous introduction to nonlinear model predictive control (NMPC) for discrete-time and sampled-data systems. NMPC schemes with and without stabilizing terminal constraints are detailed, and intuitive examples illustrate the performance of different NMPC variants. NMPC is interpreted as an approximation of infinite-horizon optimal control so that important properties like closed-loop stability, inverse optimality and suboptimality can be derived in a uniform manner. These results are complemented by discussions of feasibility and robustness. An introduction to nonlinear optimal control algorithms yields essential insights into how the nonlinear optimization routine—the core of any nonlinear model predictive controller—works. Accompanying software in MATLAB® and C++ (downloadable from extras.springer.com/), together with an explanatory appendix in the book itself, enables readers to perform computer experiments exploring the possibilities and limitations of NMPC. T...
Application of a genetic algorithm in the conformational analysis of methylene-acetal-linked thymine dimers in DNA: Comparison with distance geometry calculations

International Nuclear Information System (INIS)

Beckers, Mischa L.M.; Buydens, Lutgarde M.C.; Pikkemaat, Jeroen A.; Altona, Cornelis

1997-01-01

The three-dimensional spatial structure of a methylene-acetal-linked thymine dimer present in a 10 base-pair (bp) sense-antisense DNA duplex was studied with a genetic algorithm designed to interpret NOE distance restraints. Trial solutions were represented by torsion angles. This means that bond angles for the dimer trial structures are kept fixed during the genetic algorithm optimization. Bond angle values were extracted from a 10 bp sense-antisense duplex model that was subjected to energy minimization by means of a modified AMBER force field. A set of 63 proton-proton distance restraints defining the methylene-acetal-linked thymine dimer was available. The genetic algorithm minimizes the difference between distances in the trial structures and distance restraints. A large conformational search space could be covered in the genetic algorithm optimization by allowing a wide range of torsion angles. The genetic algorithm optimization in all cases led to one family of structures. This family of the methylene-acetal-linked thymine dimer in the duplex differs from the family that was suggested from distance geometry calculations. It is demonstrated that the bond angle geometry around the methylene-acetal linkage plays an important role in the optimization
Multi-agent cooperation rescue algorithm based on influence degree and state prediction

Science.gov (United States)

Zheng, Yanbin; Ma, Guangfu; Wang, Linlin; Xi, Pengxue

2018-04-01

Aiming at multi-agent cooperative rescue in disasters, a multi-agent cooperative rescue algorithm based on influence degree and state prediction is proposed. First, based on the influence of the information in the scene on the collaborative task, an influence degree function is used to filter the information. Second, the selected information is used to predict the state of the system and the agents' behavior. Finally, according to the prediction results, the cooperative behavior of the agents is guided and the efficiency of individual collaboration is improved. The simulation results show that this algorithm can effectively solve the multi-agent cooperative rescue problem and ensure the efficient completion of the task.
Training the Recurrent neural network by the Fuzzy Min-Max algorithm for fault prediction

International Nuclear Information System (INIS)

Zemouri, Ryad; Racoceanu, Daniel; Zerhouni, Noureddine; Minca, Eugenia; Filip, Florin

2009-01-01

In this paper, we present a training technique of a Recurrent Radial Basis Function neural network for fault prediction. We use the Fuzzy Min-Max technique to initialize the k-center of the RRBF neural network. The k-means algorithm is then applied to calculate the centers that minimize the mean square error of the prediction task. The performances of the k-means algorithm are then boosted by the Fuzzy Min-Max technique.
Historical feature pattern extraction based network attack situation sensing algorithm.

Science.gov (United States)

Zeng, Yong; Liu, Dacheng; Lei, Zhou

2014-01-01

The situation sequence contains a series of complicated and multivariate random trends, which are sudden and uncertain, and whose principles are difficult for traditional algorithms to recognize and describe. To solve these problems, estimating the parameters of a very long situation sequence is essential but very difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, the HFPE algorithm seeks similar indications in the recorded history situation sequence and weighs the link intensity between each occurred indication and its subsequent effect. Then it calculates the probability that a certain effect reappears given the current indication and makes a prediction after weighting. Meanwhile, the HFPE method provides an evolution algorithm to derive the prediction deviation from the views of pattern and accuracy. This algorithm can continuously improve the adaptability of HFPE through gradual fine-tuning. The method preserves the rules in the sequence as far as possible, does not need data preprocessing, and can continuously track and adapt to variations in the situation sequence.
Proportional-Type Performance Recovery DC-Link Voltage Tracking Algorithm for Permanent Magnet Synchronous Generators

Directory of Open Access Journals (Sweden)

Seok-Kyoon Kim

2017-09-01

Full Text Available This study proposes a disturbance observer-based proportional-type DC-link voltage tracking algorithm for permanent magnet synchronous generators (PMSGs). The proposed technique feeds back only the proportional term of the tracking errors, and it contains nominal static and dynamic feed-forward compensators derived from first-order disturbance observers. It is rigorously proved that the proposed method ensures the performance recovery and offset-free properties without using integrators of the tracking errors. A wind power generation system has been simulated to verify the efficacy of the proposed method using the PSIM (PowerSIM) software with the DLL (Dynamic Link Library) block.
Mobility-Assisted on-Demand Routing Algorithm for MANETs in the Presence of Location Errors

Directory of Open Access Journals (Sweden)

Trung Kien Vu

2014-01-01

Full Text Available We propose a mobility-assisted on-demand routing algorithm for mobile ad hoc networks in the presence of location errors. Location awareness enables mobile nodes to predict their mobility and enhances routing performance by estimating link duration and selecting reliable routes. However, measured locations intrinsically include errors in measurement. Such errors degrade mobility prediction and have been ignored in previous work. To mitigate the impact of location errors on routing, we propose an on-demand routing algorithm taking into account location errors. To that end, we adopt the Kalman filter to estimate accurate locations and consider route confidence in discovering routes. Via simulations, we compare our algorithm and previous algorithms in various environments. Our proposed mobility prediction is robust to the location errors.
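To make the role of the Kalman filter concrete, here is a minimal Python sketch that smooths noisy one-dimensional position measurements. The constant-velocity motion model, the noise variances, the initial covariance, and the function name `kalman_track` are all assumptions for illustration; the abstract does not specify the model used in the paper.

```python
def kalman_track(measurements, dt=1.0, process_var=1e-4, meas_var=0.25):
    """Filter noisy 1-D positions with a constant-velocity Kalman filter.

    State is (position, velocity); only position is measured.
    Returns the filtered position after each measurement.
    """
    pos, vel = measurements[0], 0.0
    # State covariance entries [[p00, p01], [p10, p11]] (assumed initial value: identity).
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0
    estimates = [pos]
    for z in measurements[1:]:
        # Predict: advance the state and covariance by one time step.
        pos = pos + dt * vel
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11,
            p01 + dt * p11,
            p10 + dt * p11,
            p11 + process_var,
        )
        # Update: blend the prediction with the new measurement.
        innovation = z - pos
        s = p00 + meas_var
        k0, k1 = p00 / s, p10 / s
        pos += k0 * innovation
        vel += k1 * innovation
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
        estimates.append(pos)
    return estimates
```

A two-dimensional node position could be handled by running one such filter per coordinate; the filtered positions and velocities would then feed whatever link-duration estimate the routing protocol uses.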
Predicting patchy particle crystals: variable box shape simulations and evolutionary algorithms.

Science.gov (United States)

Bianchi, Emanuela; Doppelbauer, Günther; Filion, Laura; Dijkstra, Marjolein; Kahl, Gerhard

2012-06-07

We consider several patchy particle models that have been proposed in literature and we investigate their candidate crystal structures in a systematic way. We compare two different algorithms for predicting crystal structures: (i) an approach based on Monte Carlo simulations in the isobaric-isothermal ensemble and (ii) an optimization technique based on ideas of evolutionary algorithms. We show that the two methods are equally successful and provide consistent results on crystalline phases of patchy particle systems.
Adaptive algorithm for predicting increases in central loads of electrical energy systems

Energy Technology Data Exchange (ETDEWEB)

Arbachyauskene, N A; Pushinaytis, K V

1982-01-01

An adaptive algorithm for predicting increases in central loads of the electrical energy system is suggested for the task of state estimation. The algorithm is based on the Kalman filter. To calculate the filter gain, the a priori assigned noise characteristics, which have low accuracy, are used only at the beginning of the calculation. Thereafter, the gain is calculated from the innovation sequence. This approach makes it possible to correct errors in the assignment of the statistical noise characteristics and to follow their changes. The algorithm is experimentally verified.
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

Science.gov (United States)

Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

2016-05-28

Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.
Advertisement Click-Through Rate Prediction Based on the Weighted-ELM and Adaboost Algorithm

Directory of Open Access Journals (Sweden)

Sen Zhang

2017-01-01

Full Text Available Accurate click-through rate (CTR) prediction can not only improve the advertisement company’s reputation and revenue, but also help advertisers optimize their advertising performance. There are two main unsolved problems in CTR prediction: low prediction accuracy due to the imbalanced distribution of the advertising data, and the lack of a real-time advertisement bidding implementation. In this paper, we develop a novel online CTR prediction approach that incorporates real-time bidding (RTB) advertising through the following strategies. A user profile system is constructed from the historical data of the RTB advertising to describe the user features, the historical CTR features, the ID features, and the other numerical features. A novel CTR prediction approach is presented to address the imbalanced learning sample distribution by integrating the Weighted-ELM (WELM) and the Adaboost algorithm. Compared to the commonly used algorithms, the proposed approach can improve the CTR significantly.
A multi-frame particle tracking algorithm robust against input noise

International Nuclear Information System (INIS)

Li, Dongning; Zhang, Yuanhui; Sun, Yigang; Yan, Wei

2008-01-01

The performance of a particle tracking algorithm that detects particle trajectories from discretely recorded particle positions can be substantially hindered by input noise. In this paper, a particle tracking algorithm is developed that is robust against input noise. To predict future particle positions, this algorithm employs regression instead of the extrapolation usually employed by existing algorithms. If a trajectory cannot be linked to a particle at a frame, the algorithm can still proceed by trying to find a candidate at the next frame. The connectivity of tracked trajectories is inspected to remove the false ones. The algorithm is validated with synthetic data. The results show that the algorithm is superior to traditional algorithms at tracking long trajectories.
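The difference between extrapolation and regression as position predictors can be seen in a small Python sketch. It is only an illustration: the abstract does not say which extrapolation or regression form the authors use, so a two-point linear extrapolation and a least-squares straight-line fit over one coordinate are assumed, and the function names are hypothetical.

```python
def predict_by_extrapolation(positions):
    """Next position from the last two points only (sensitive to noise in either)."""
    return 2 * positions[-1] - positions[-2]


def predict_by_regression(positions):
    """Next position from a least-squares line fitted to all given frames."""
    n = len(positions)
    frames = range(n)
    mean_t = (n - 1) / 2
    mean_x = sum(positions) / n
    sxx = sum((t - mean_t) ** 2 for t in frames)
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(frames, positions))
    slope = sxy / sxx
    # Evaluate the fitted line at the next frame index, t = n.
    return mean_x + slope * (n - mean_t)


# A particle moving one unit per frame, with noise of +0.4 on the last frame.
track = [0.0, 1.0, 2.0, 3.0, 4.4]
extrapolated = predict_by_extrapolation(track)  # 5.8, error 0.8 against the true 5.0
regressed = predict_by_regression(track)        # 5.32, error 0.32 against the true 5.0
```

Because the fit averages over all five frames, a single noisy position shifts the regression prediction less than it shifts the two-point extrapolation.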
A Decomposition Algorithm for Mean-Variance Economic Model Predictive Control of Stochastic Linear Systems

DEFF Research Database (Denmark)

Sokoler, Leo Emil; Dammann, Bernd; Madsen, Henrik

2014-01-01

This paper presents a decomposition algorithm for solving the optimal control problem (OCP) that arises in Mean-Variance Economic Model Predictive Control of stochastic linear systems. The algorithm applies the alternating direction method of multipliers to a reformulation of the OCP...

A novel multilayer model for missing link prediction and future link forecasting in dynamic complex networks

Science.gov (United States)

Yasami, Yasser; Safaei, Farshad

2018-02-01

The traditional complex network theory focuses on network models in which all network constituents are treated equivalently, while failing to consider the supplementary information related to the dynamic properties of the network interactions. This is a main constraint, leading to incorrect descriptions of some real-world phenomena or incomplete capture of the details of certain real-life problems. To cope with the problem, this paper addresses the multilayer aspects of dynamic complex networks by analyzing the properties of intrinsically multilayered co-authorship networks, DBLP and Astro Physics, and presenting a novel multilayer model of dynamic complex networks. The model examines the evolution of layers (the layer birth/death process and lifetime) throughout the network evolution. In particular, this paper models the evolution of each node's membership in different layers by an Infinite Factorial Hidden Markov Model considering feature cascade, and thereby formulates the link generation process for intra-layer and inter-layer links. Although adjacency matrices are useful for describing traditional single-layer networks, such a representation is not sufficient to describe and analyze multilayer dynamic networks. This paper also extends a generalized mathematical infrastructure to address the problems posed by multilayer complex networks. The model inference is performed using Markov Chain Monte Carlo sampling strategies, given synthetic and real complex network data. Experimental results indicate a tremendous improvement in the performance of the proposed multilayer model in terms of sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, F1-score, Matthews correlation coefficient, and accuracy for two important applications of missing link prediction and future link forecasting. The experimental results also indicate the strong predictive power of the proposed model for the application of
Research on Demand Prediction of Fresh Food Supply Chain Based on Improved Particle Swarm Optimization Algorithm

OpenAIRE

He Wang

2015-01-01

Demand prediction is an important component and the first prerequisite of supply chain management in enterprises, and it has become both a difficulty and an active research area. The paper takes fresh food demand prediction as an example and presents a new algorithm for predicting demand in the fresh food supply chain. First, the working principle of the particle swarm optimization algorithm and the root causes of its defects are analyzed in the study. Second, the...
Comparison of genetic algorithm and imperialist competitive algorithms in predicting bed load transport in clean pipe.

Science.gov (United States)

Ebtehaj, Isa; Bonakdari, Hossein

2014-01-01

The existence of sediments in wastewater greatly affects the performance of the sewer and wastewater transmission systems. Increased sedimentation in wastewater collection systems causes problems such as reduced transmission capacity and early combined sewer overflow. The article reviews the performance of the genetic algorithm (GA) and imperialist competitive algorithm (ICA) in minimizing the target function (mean square error of observed and predicted Froude number). To study the impact of bed load transport parameters, using four non-dimensional groups, six different models have been presented. Moreover, the roulette wheel selection method is used to select the parents. The ICA, with root mean square error (RMSE) = 0.007 and mean absolute percentage error (MAPE) = 3.5%, shows better results than the GA (RMSE = 0.007, MAPE = 5.6%) for the selected model. All six models return better results than the GA. Also, the results of these two algorithms were compared with multi-layer perceptron and existing equations.
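The two error measures quoted in the abstract above can be sketched as follows. This is a minimal illustration using the standard definitions of RMSE and MAPE; the function names and the Froude-number values are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


# Hypothetical Froude numbers, invented purely for illustration.
observed = [2.0, 4.0, 5.0]
predicted = [2.2, 3.8, 5.0]
print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"MAPE = {mape(observed, predicted):.2f}%")
```

Because RMSE is expressed in the units of the data while MAPE is relative, two models can share the same RMSE and still differ in MAPE, as the figures reported in the abstract do.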
Application of Functional Link Artificial Neural Network for Prediction of Machinery Noise in Opencast Mines

Directory of Open Access Journals (Sweden)

Santosh Kumar Nanda

2011-01-01

Full Text Available Functional link-based neural network models were applied to predict opencast mining machinery noise. The paper analyzes the prediction capabilities of functional link neural network based noise prediction models vis-à-vis existing statistical models. In order to find the actual noise status in opencast mines, some of the popular noise prediction models, for example, ISO-9613-2, CONCAWE, VDI, and ENM, have been applied in mining and allied industries to predict the machinery noise by considering various attenuation factors. Functional link artificial neural network (FLANN), polynomial perceptron network (PPN), and Legendre neural network (LeNN) were used to predict the machinery noise in opencast mines. The case study is based on data collected from an opencast coal mine of Orissa, India. From the present investigations, it could be concluded that the FLANN model gives better noise prediction than the PPN and LeNN models.
HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

Directory of Open Access Journals (Sweden)

Xiaomin Wang

2011-01-01

Full Text Available With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interests gradually shift to systematic analysis of these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has relatively high F-measure and exhibits better performance compared with some other methods.
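The "highest k-core" concept mentioned in the abstract above can be illustrated with a short sketch. This is not the HKC algorithm itself (the cohesion step and overlapping-cluster detection are omitted); it only shows the standard peeling procedure for k-cores. The function names and the toy graph are hypothetical.

```python
def k_core(adj, k):
    """Nodes of the k-core: the maximal subgraph in which every node has degree >= k."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for u in list(alive):
            # Count only neighbours that are still in the candidate subgraph.
            if len(adj[u] & alive) < k:
                alive.discard(u)
                changed = True
    return alive


def highest_k_core(adj):
    """Return (k, nodes) for the largest k whose k-core is non-empty."""
    k, best = 0, set(adj)
    while True:
        core = k_core(adj, k + 1)
        if not core:
            return k, best
        k, best = k + 1, core


# Toy undirected graph: a 4-clique (a, b, c, d) with a two-node tail (e, f).
graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a", "f"},
    "f": {"e"},
}
k, nodes = highest_k_core(graph)
print(k, sorted(nodes))
```

In this toy graph the tail nodes are peeled away first, leaving the clique as the densest core, which is the kind of densely connected seed a complex-detection method can start from.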
Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

Directory of Open Access Journals (Sweden)

Xiao-Lin Wu

Full Text Available Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus
Investigating the link between radiologists’ gaze, diagnostic decision, and image content

Science.gov (United States)

Tourassi, Georgia; Voisin, Sophie; Paquit, Vincent; Krupinski, Elizabeth

2013-01-01

Objective: To investigate machine learning for linking image content, human perception, cognition, and error in the diagnostic interpretation of mammograms. Methods: Gaze data and diagnostic decisions were collected from three breast imaging radiologists and three radiology residents who reviewed 20 screening mammograms while wearing a head-mounted eye-tracker. Image analysis was performed in mammographic regions that attracted radiologists’ attention and in all abnormal regions. Machine learning algorithms were investigated to develop predictive models that link: (i) image content with gaze, (ii) image content and gaze with cognition, and (iii) image content, gaze, and cognition with diagnostic error. Both group-based and individualized models were explored. Results: By pooling the data from all readers, machine learning produced highly accurate predictive models linking image content, gaze, and cognition. Potential linking of those with diagnostic error was also supported to some extent. Merging readers’ gaze metrics and cognitive opinions with computer-extracted image features identified 59% of the readers’ diagnostic errors while confirming 97.3% of their correct diagnoses. The readers’ individual perceptual and cognitive behaviors could be adequately predicted by modeling the behavior of others. However, personalized tuning was in many cases beneficial for capturing more accurately individual behavior. Conclusions: There is clearly an interaction between radiologists’ gaze, diagnostic decision, and image content which can be modeled with machine learning algorithms. PMID: 23788627
The fatigue life prediction of aluminium alloy using genetic algorithm and neural network

Science.gov (United States)

Susmikanti, Mike

2013-09-01

The behavior of the fatigue life of industrial materials is very important. In many cases, fatigue in a material cannot be avoided; however, there are many ways to control its behavior. Many investigations of the fatigue life phenomena of alloys have been done, but they are costly and involve time-consuming computation. This paper reports modeling and simulation approaches to predict the fatigue life behavior of aluminum alloys and resolves some problems of computation. First, simulation using a genetic algorithm was utilized to optimize the load to obtain the stress values. These results can be used to provide the N-cycle fatigue life of the material. Furthermore, the experimental data were applied as input data in the neural network learning, while the sample data were applied for testing of the training data. Finally, the multilayer perceptron algorithm is applied to predict whether the given data sets are in accordance with the fatigue life of the alloy. To achieve rapid convergence, the Levenberg-Marquardt algorithm was also employed. The simulation results show that the fatigue behavior of aluminum under pressure can be predicted. In addition, the implementation of neural networks successfully identified a model for material fatigue life.
Predicting patchy particle crystals: variable box shape simulations and evolutionary algorithms

NARCIS (Netherlands)

Bianchi, E.; Doppelbauer, G.; Filion, L.C.; Dijkstra, M.; Kahl, G.

2012-01-01

We consider several patchy particle models that have been proposed in literature and we investigate their candidate crystal structures in a systematic way. We compare two different algorithms for predicting crystal structures: (i) an approach based on Monte Carlo simulations in the
A probabilistic fragment-based protein structure prediction algorithm.

Directory of Open Access Journals (Sweden)

David Simoncini

Full Text Available Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].
Recognition algorithms in knot theory

International Nuclear Information System (INIS)

Dynnikov, I A

2003-01-01

In this paper the problem of constructing algorithms for comparing knots and links is discussed. A survey of existing approaches and basic results in this area is given. In particular, diverse combinatorial methods for representing links are discussed, the Haken algorithm for recognizing a trivial knot (the unknot) and a scheme for constructing a general algorithm (using Haken's ideas) for comparing links are presented, an approach based on representing links by closed braids is described, the known algorithms for solving the word problem and the conjugacy problem for braid groups are described, and the complexity of the algorithms under consideration is discussed. A new method of combinatorial description of knots is given together with a new algorithm (based on this description) for recognizing the unknot by using a procedure for monotone simplification. In the conclusion of the paper several problems are formulated whose solution could help to advance towards the 'algorithmization' of knot theory
Beam-column joint shear prediction using hybridized deep learning neural network with genetic algorithm

Science.gov (United States)

Mundher Yaseen, Zaher; Abdulmohsin Afan, Haitham; Tran, Minh-Tung

2018-04-01

It is scientifically evidenced that beam-column joints are a critical point in the reinforced concrete (RC) structure under the effects of fluctuating loads. In this study, a novel hybrid data-intelligence model is developed to predict the joint shear behavior of exterior beam-column structure frames. The hybrid data-intelligence model is called genetic algorithm integrated with deep learning neural network model (GA-DLNN). The genetic algorithm is used as a prior modelling phase for the input approximation, whereas the DLNN predictive model is used for the prediction phase. To demonstrate this structural problem, experimental data are collected from the literature that defined the dimensional and specimens’ properties. The attained findings evidenced the effectiveness of the hybrid GA-DLNN in modelling the beam-column joint shear problem. In addition, accurate prediction was achieved with fewer input variables owing to the feasibility of the evolutionary phase.
A new spirometry-based algorithm to predict occupational pulmonary restrictive impairment.

Science.gov (United States)

De Matteis, S; Iridoy-Zulet, A A; Aaron, S; Swann, A; Cullinan, P

2016-01-01

Spirometry is often included in workplace-based respiratory surveillance programmes but its performance in the identification of restrictive lung disease is poor, especially when the prevalence of this condition is low in the tested population. To improve the specificity (Sp) and positive predictive value (PPV) of current spirometry-based algorithms in the diagnosis of restrictive pulmonary impairment in the workplace and to reduce the proportion of false positives findings and, as a result, unnecessary referrals for lung volume measurements. We re-analysed two studies of hospital patients, respectively used to derive and validate a recommended spirometry-based algorithm [forced vital capacity (FVC) 55%] for the recognition of restrictive pulmonary impairment. We used true lung restrictive cases as a reference standard in 2×2 contingency tables to estimate sensitivity (Sn), Sp and PPV and negative predictive values for each diagnostic cut-off. We simulated a working population aged spirometry-based algorithm may be adopted to accurately exclude pulmonary restriction and to possibly reduce unnecessary lung volume testing in an occupational health setting. © The Author 2015. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Intermediate-term medium-range earthquake prediction algorithm M8: A new spatially stabilized application in Italy

International Nuclear Information System (INIS)

Romashkova, L.L.; Kossobokov, V.G.; Peresan, A.; Panza, G.F.

2001-12-01

A series of experiments, based on the intermediate-term earthquake prediction algorithm M8, has been performed for the retrospective simulation of forward predictions in the Italian territory, with the aim to design an experimental routine for real-time predictions. These experiments evidenced two main difficulties for the application of M8 in Italy. The first one is due to the fact that regional catalogues are usually limited in space. The second one concerns certain arbitrariness and instability, with respect to the positioning of the circles of investigation. Here we design a new scheme for the application of the algorithm M8, which is less subjective and less sensitive to the position of the circles of investigation. To perform this test, we consider a recent revision of the Italian catalogue, named UCI2001, composed by CCI1996, NEIC and ALPOR data for the period 1900-1985, and updated with the NEIC reduces the spatial heterogeneity of the data at the boundaries of Italy. The new variant of the M8 algorithm application reduces the number of spurious alarms and increases the reliability of predictions. As a result, three out of four earthquakes with magnitude M max larger than 6.0 are predicted in the retrospective simulation of the forward prediction, during the period 1972-2001, with a space-time volume of alarms comparable to that obtained with the non-stabilized variant of the M8 algorithm in Italy. (author)
Investigating the Link Between Radiologists' Gaze, Diagnostic Decision, and Image Content

Energy Technology Data Exchange (ETDEWEB)

Tourassi, Georgia [ORNL; Voisin, Sophie [ORNL; Paquit, Vincent C [ORNL; Krupinski, Elizabeth [University of Arizona

2013-01-01

Objective: To investigate machine learning for linking image content, human perception, cognition, and error in the diagnostic interpretation of mammograms. Methods: Gaze data and diagnostic decisions were collected from six radiologists who reviewed 20 screening mammograms while wearing a head-mounted eye-tracker. Texture analysis was performed in mammographic regions that attracted radiologists' attention and in all abnormal regions. Machine learning algorithms were investigated to develop predictive models that link: (i) image content with gaze, (ii) image content and gaze with cognition, and (iii) image content, gaze, and cognition with diagnostic error. Both group-based and individualized models were explored. Results: By pooling the data from all radiologists, machine learning produced highly accurate predictive models linking image content, gaze, cognition, and error. Merging radiologists' gaze metrics and cognitive opinions with computer-extracted image features identified 59% of the radiologists' diagnostic errors while confirming 96.2% of their correct diagnoses. The radiologists' individual errors could be adequately predicted by modeling the behavior of their peers. However, personalized tuning appears to be beneficial in many cases to capture more accurately individual behavior. Conclusions: Machine learning algorithms combining image features with radiologists' gaze data and diagnostic decisions can be effectively developed to recognize cognitive and perceptual errors associated with the diagnostic interpretation of mammograms.
A grand canonical genetic algorithm for the prediction of multi-component phase diagrams and testing of empirical potentials

International Nuclear Information System (INIS)

Tipton, William W; Hennig, Richard G

2013-01-01

We present an evolutionary algorithm which predicts stable atomic structures and phase diagrams by searching the energy landscape of empirical and ab initio Hamiltonians. Composition and geometrical degrees of freedom may be varied simultaneously. We show that this method utilizes information from favorable local structure at one composition to predict that at others, achieving far greater efficiency of phase diagram prediction than a method which relies on sampling compositions individually. We detail this and a number of other efficiency-improving techniques implemented in the genetic algorithm for structure prediction code that is now publicly available. We test the efficiency of the software by searching the ternary Zr–Cu–Al system using an empirical embedded-atom model potential. In addition to testing the algorithm, we also evaluate the accuracy of the potential itself. We find that the potential stabilizes several correct ternary phases, while a few of the predicted ground states are unphysical. Our results suggest that genetic algorithm searches can be used to improve the methodology of empirical potential design. (paper)
A grand canonical genetic algorithm for the prediction of multi-component phase diagrams and testing of empirical potentials.

Science.gov (United States)

Tipton, William W; Hennig, Richard G

2013-12-11

We present an evolutionary algorithm which predicts stable atomic structures and phase diagrams by searching the energy landscape of empirical and ab initio Hamiltonians. Composition and geometrical degrees of freedom may be varied simultaneously. We show that this method utilizes information from favorable local structure at one composition to predict that at others, achieving far greater efficiency of phase diagram prediction than a method which relies on sampling compositions individually. We detail this and a number of other efficiency-improving techniques implemented in the genetic algorithm for structure prediction code that is now publicly available. We test the efficiency of the software by searching the ternary Zr-Cu-Al system using an empirical embedded-atom model potential. In addition to testing the algorithm, we also evaluate the accuracy of the potential itself. We find that the potential stabilizes several correct ternary phases, while a few of the predicted ground states are unphysical. Our results suggest that genetic algorithm searches can be used to improve the methodology of empirical potential design.
An outlook on robust model predictive control algorithms : Reflections on performance and computational aspects

NARCIS (Netherlands)

Saltik, M.B.; Özkan, L.; Ludlage, J.H.A.; Weiland, S.; Van den Hof, P.M.J.

2018-01-01

In this paper, we discuss the model predictive control algorithms that are tailored for uncertain systems. Robustness notions with respect to both deterministic (or set based) and stochastic uncertainties are discussed and contributions are reviewed in the model predictive control literature. We
PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast.

Science.gov (United States)

Lai, Fu-Jou; Chang, Hong-Tsun; Wu, Wei-Sheng

2015-01-01

Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of the lack of sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put in a lot of effort to construct it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in the PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. Allowing users to select eight existing performance indices and 15
WINROP algorithm for prediction of sight threatening retinopathy of prematurity: Initial experience in Indian preterm infants

Directory of Open Access Journals (Sweden)

Gaurav Sanghi

2018-01-01

Full Text Available Purpose: To determine the efficacy of the online monitoring tool, WINROP (https://winrop.com/) in detecting sight-threatening type 1 retinopathy of prematurity (ROP) in Indian preterm infants. Methods: Birth weight, gestational age, and weekly weight measurements of seventy preterm infants (<32 weeks gestation) born between June 2014 and August 2016 were entered into WINROP algorithm. Based on weekly weight gain, WINROP algorithm signaled an alarm to indicate that the infant is at risk for sight-threatening Type 1 ROP. ROP screening was done according to standard guidelines. The negative and positive predictive values were calculated using the sensitivity, specificity, and prevalence of ROP type 1 for the study group. 95% confidence interval (CI) was calculated. Results: Of the seventy infants enrolled in the study, 31 (44.28%) developed Type 1 ROP. WINROP alarm was signaled in 74.28% (52/70) of all infants and 90.32% (28/31) of infants treated for Type 1 ROP. The specificity was 38.46% (15/39). The positive predictive value was 53.84% (95% CI: 39.59–67.53) and negative predictive value was 83.3% (95% CI: 57.73–95.59). Conclusion: This is the first study from India using a weight gain-based algorithm for prediction of ROP. Overall sensitivity of WINROP algorithm in detecting Type 1 ROP was 90.32%. The overall specificity was 38.46%. Population-specific tweaking of algorithm may improve the result and practical utility for ophthalmologists and neonatologists.
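The figures reported in the abstract above are mutually consistent, which can be checked with a short sketch. The 2×2 counts below are derived from the abstract (28 of 31 Type 1 ROP infants triggered an alarm, 52 alarms in total, 15 of 39 unaffected infants correctly had no alarm); the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard screening-test measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # affected infants who triggered an alarm
        "specificity": tn / (tn + fp),  # unaffected infants with no alarm
        "ppv": tp / (tp + fp),          # alarms that were true Type 1 ROP
        "npv": tn / (tn + fn),          # non-alarms that were truly unaffected
    }


# Counts derived from the abstract: fp = 52 - 28 alarms, fn = 31 - 28 cases.
metrics = diagnostic_metrics(tp=28, fp=24, fn=3, tn=15)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

The positive predictive value is 28/52, which rounds to 53.85%; the abstract's 53.84% appears to be the same quantity truncated rather than rounded.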

Accuracy test for link prediction in terms of similarity index: The case of WS and BA models

Science.gov (United States)

Ahn, Min-Woo; Jung, Woo-Sung

2015-07-01

Link prediction is a technique that uses the topological information in a given network to infer the missing links in it. Since past research on link prediction has primarily focused on enhancing performance for given empirical systems, negligible attention has been devoted to link prediction with regard to network models. In this paper, we thus apply link prediction to two network models: The Watts-Strogatz (WS) model and Barabási-Albert (BA) model. We attempt to gain a better understanding of the relation between accuracy and each network parameter (mean degree, the number of nodes and the rewiring probability in the WS model) through network models. Six similarity indices are used, with precision and area under the ROC curve (AUC) value as the accuracy metrics. We observe a positive correlation between mean degree and accuracy, and size independence of the AUC value.
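A minimal sketch of the evaluation described in the abstract above: one similarity index (common neighbours) scored by AUC on a network model. To stay deterministic, it assumes a Watts-Strogatz ring with rewiring probability 0 and a hand-picked set of held-out links; both choices, and all function names, are illustrative assumptions rather than the authors' setup.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for d in range(1, k // 2 + 1):
            v = (u + d) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def common_neighbours(adj, u, v):
    """Similarity index: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def auc(train, missing, nonexistent, score):
    """Probability that a missing link outscores a nonexistent one (ties count 0.5)."""
    total = 0.0
    for a, b in missing:
        s_missing = score(train, a, b)
        for c, d in nonexistent:
            s_none = score(train, c, d)
            total += 1.0 if s_missing > s_none else 0.5 if s_missing == s_none else 0.0
    return total / (len(missing) * len(nonexistent))


n, k = 30, 4
full = ring_lattice(n, k)
probe = [(0, 1), (10, 12), (20, 21)]  # held-out links, chosen by hand
train = {u: set(vs) for u, vs in full.items()}
for u, v in probe:
    train[u].discard(v)
    train[v].discard(u)
nonexistent = [(u, v) for u, v in combinations(range(n), 2) if v not in full[u]]
value = auc(train, probe, nonexistent, common_neighbours)
print(f"AUC = {value:.3f}")
```

An AUC of 0.5 corresponds to random guessing. In a regular ring the held-out links sit between nodes that share neighbours while most non-links do not, so the index scores well here; rewiring would weaken that locality.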
ANNIT - An Efficient Inversion Algorithm based on Prediction Principles

Science.gov (United States)

Růžek, B.; Kolář, P.

2009-04-01

Solving inverse problems is an important task in geophysics. The amount of data is continuously increasing, methods of modeling are being improved and computing facilities are also making great technical progress. Therefore the development of new and efficient algorithms and computer codes for both forward and inverse modeling remains timely. ANNIT contributes to this stream since it is a tool for efficient solution of a set of non-linear equations. Typical geophysical problems are based on a parametric approach. The system is characterized by a vector of parameters p, the response of the system is characterized by a vector of data d. The forward problem is usually represented by a unique mapping F(p)=d. The inverse problem is much more complex and the inverse mapping p=G(d) is available in an analytical or closed form only exceptionally and generally it may not exist at all. Technically, both forward and inverse mapping F and G are sets of non-linear equations. ANNIT handles this situation as follows: (i) joint subspaces {pD, pM} of original data and model spaces D, M, resp. are searched for, within which the forward mapping F is sufficiently smooth that the inverse mapping G does exist, (ii) numerical approximation of G in subspaces {pD, pM} is found, (iii) candidate solution is predicted by using this numerical approximation. ANNIT works iteratively in cycles. The subspaces {pD, pM} are searched for by generating suitable populations of individuals (models) covering data and model spaces. The approximation of the inverse mapping is made by using three methods: (a) linear regression, (b) Radial Basis Function Network technique, (c) linear prediction (also known as "Kriging"). The ANNIT algorithm also has a built-in archive of already evaluated models. Archive models are re-used in a suitable way and thus the number of forward evaluations is minimized. ANNIT is now implemented both in MATLAB and SCILAB. Numerical tests show good
Predicting the growth of new links by new preferential attachment ...

Indian Academy of Sciences (India)

2014-03-07

Mar 7, 2014 ... ... Science and Engineering, Central University of Finance and Economics, ... nism of network evolution, and also for predicting the growth of new links, without .... the high voltage transmission lines between them. ..... 6104104, 11147121 and 61104143), the Scientific Research Fund of Education Depart-.
An Improved Shuffled Frog Leaping Algorithm and Its Application in Dynamic Emergency Vehicle Dispatching

Directory of Open Access Journals (Sweden)

Xiaohong Duan

2018-01-01

Full Text Available The traditional method for solving the dynamic emergency vehicle dispatching problem can only get a local optimal strategy in each horizon. In order to obtain the dispatching strategy that can better respond to changes in road conditions during the whole dispatching process, the real-time and time-dependent link travel speeds are fused, and a time-dependent polygonal-shaped link travel speed function is set up to simulate the predictable changes in road conditions. Response times, accident severity, and accident time windows are taken as key factors to build an emergency vehicle dispatching model integrating dynamic emergency vehicle routing and selection. For the unpredictable changes in road conditions caused by accidents, the dispatching strategy is adjusted based on the real-time link travel speed. In order to solve the dynamic emergency vehicle dispatching model, an improved shuffled frog leaping algorithm (ISFLA) is proposed. The global search of the improved algorithm uses the probability model of estimation of distribution algorithm to avoid the partial optimal solution. Based on the Beijing expressway network, the efficacy of the model and the improved algorithm were tested from three aspects. The results have shown the following: (1) Compared with SFLA, the optimization performance of ISFLA is getting better and better with the increase of the number of decision variables. When the possible emergency vehicle selection strategies are 815, the objective function value of optimal selection strategies obtained by the base algorithm is 210.10% larger than that of ISFLA. (2) The prediction error of the travel speed affects the accuracy of the initial emergency vehicle dispatching. The prediction error of ±10 can basically meet the requirements of the initial dispatching. (3) The adjustment of emergency vehicle dispatching strategy can successfully bypass road sections affected by accidents and shorten the response time.
Algorithm for predicting the evolution of series of dynamics of complex systems in solving information problems

Science.gov (United States)

Kasatkina, T. I.; Dushkin, A. V.; Pavlov, V. A.; Shatovkin, R. R.

2018-03-01

In the development of information systems and software for predicting series of dynamics, neural network methods have recently been applied. They are more flexible than existing analogues and are capable of taking into account the nonlinearities of the series. In this paper, we propose a modified algorithm for predicting the series of dynamics, which includes a method for training neural networks and an approach to describing and presenting input data, based on prediction by the multilayer perceptron method. To construct a neural network, the values of a series of dynamics at the extremum points and the time values corresponding to them, formed based on the sliding window method, are used as input data. The proposed algorithm can act as an independent approach to predicting the series of dynamics, or be one of the parts of a forecasting system. The efficiency of predicting the evolution of the dynamics series for a short-term one-step and long-term multi-step forecast by the classical multilayer perceptron method and by the modified algorithm, using synthetic and real data, is compared. The result of this modification was the minimization of the magnitude of the iterative error that arises when previously predicted values are fed back to the inputs of the neural network, as well as an increase in the accuracy of the iterative prediction of the neural network.
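The input construction described in the abstract above (series values at extremum points, with their time values, grouped by a sliding window) might look roughly like the sketch below. The abstract does not give the exact procedure, so the strict local-extremum rule, the window layout and all names are assumptions made for illustration; the neural network itself is omitted.

```python
def local_extrema(series):
    """Return (time index, value) pairs at strict local maxima and minima."""
    points = []
    for t in range(1, len(series) - 1):
        prev, cur, nxt = series[t - 1], series[t], series[t + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            points.append((t, cur))
    return points


def sliding_windows(points, width):
    """Pair each run of `width` consecutive extrema with the extremum that follows."""
    return [
        (points[i:i + width], points[i + width])
        for i in range(len(points) - width)
    ]


# Hypothetical series, invented for illustration.
series = [1, 3, 2, 5, 4, 6, 1, 2]
points = local_extrema(series)
samples = sliding_windows(points, width=3)
for inputs, target in samples:
    print(inputs, "->", target)
```

Each sample pairs a window of (time, value) extrema with the next extremum, which is one plausible way to present training inputs and targets to a multilayer perceptron.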
Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity

Directory of Open Access Journals (Sweden)

Xin Yi Ng

2015-01-01

Full Text Available This study attempts to establish a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Researchers have recently become interested in designing alternative drugs based on AMPs because a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMP design process, as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMP prediction is needed. In this study, an integrated algorithm is introduced that predicts AMPs by combining sequence alignment with a support vector machine (SVM)-LZ complexity pairwise algorithm. When all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test; when only sequences that have less than 70% similarity are used, the sensitivity is 88.74% in the jackknife test and 78.70% in the independent test. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity.
Urban Link Travel Time Prediction Based on a Gradient Boosting Method Considering Spatiotemporal Correlations

Directory of Open Access Journals (Sweden)

Faming Zhang

2016-11-01

Full Text Available The prediction of travel times is challenging because of the sparseness of real-time traffic data and the intrinsic uncertainty of travel on congested urban road networks. We propose a new gradient–boosted regression tree method to accurately predict travel times. This model accounts for spatiotemporal correlations extracted from historical and real-time traffic data for adjacent and target links. This method can deliver high prediction accuracy by combining simple regression trees with poor performance. It corrects the error found in existing models for improved prediction accuracy. Our spatiotemporal gradient–boosted regression tree model was verified in experiments. The training data were obtained from big data reflecting historic traffic conditions collected by probe vehicles in Wuhan from January to May 2014. Real-time data were extracted from 11 weeks of GPS records collected in Wuhan from 5 May 2014 to 20 July 2014. Based on these data, we predicted link travel time for the period from 21 July 2014 to 25 July 2014. Experiments showed that our proposed spatiotemporal gradient–boosted regression tree model obtained better results than gradient boosting, random forest, or autoregressive integrated moving average approaches. Furthermore, these results indicate the advantages of our model for urban link travel time prediction.
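The idea of combining simple, individually weak regression trees into an accurate predictor can be illustrated with a minimal gradient boosting loop for squared loss. This sketch uses one-split stumps on a single feature. The function names, the learning rate, and the single-feature setup are assumptions for illustration, and it does not reproduce the spatiotemporal features of the paper.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on one feature (squared error).

    Requires at least two distinct values in xs.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean


def gradient_boost(xs, ys, rounds=50, learning_rate=0.1):
    """Additive model: a constant plus `rounds` shrunken stumps."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    for _ in range(rounds):
        # For squared loss the negative gradient is the plain residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict
```

Each round fits a stump to what the current ensemble still gets wrong, so the training error shrinks as rounds are added; the learning rate trades convergence speed against overfitting.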
Research on prediction of agricultural machinery total power based on grey model optimized by genetic algorithm

Science.gov (United States)

Xie, Yan; Li, Mu; Zhou, Jin; Zheng, Chang-zheng

2009-07-01

Agricultural machinery total power is an important index for reflecting and evaluating the level of agricultural mechanization. It is the power source of agricultural production and a main factor in enhancing comprehensive agricultural production capacity, expanding production scale, and increasing farmers' income. Its demand is affected by natural, economic, technological, social, and other "grey" factors. Therefore, grey system theory can be used to analyze the development of agricultural machinery total power. This paper introduces a method in which a genetic algorithm optimizes the grey modeling process. The method combines the advantages of the grey prediction model with the global optimization capability of the genetic algorithm, so the prediction model is more accurate. Using data from a province, a GM(1,1) model for predicting agricultural machinery total power is built based on grey system theory and the genetic algorithm. The results indicate that the model can be used as an effective tool for predicting agricultural machinery total power.
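For readers unfamiliar with GM(1,1), the standard model can be sketched in a few lines. This sketch estimates the parameters by ordinary least squares rather than by the genetic algorithm the paper describes. The function names are illustrative, and the series is assumed to be positive and non-constant so that the development coefficient `a` is nonzero.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from a positive series x0."""
    x1, total = [], 0.0
    for value in x0:                     # accumulated generating operation
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Least squares for y[k] = -a * z[k] + b via the 2x2 normal equations.
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(v * w for v, w in zip(z, y))
    slope = (n * szy - sz * sy) / (n * szz - sz * sz)
    b = (sy - slope * sz) / n
    return -slope, b


def gm11_predict(x0, k):
    """Model value of the original series at index k (k >= 1).

    k may exceed len(x0) - 1, in which case the result is a forecast.
    """
    a, b = gm11_fit(x0)

    def x1_hat(i):
        return (x0[0] - b / a) * math.exp(-a * i) + b / a

    return x1_hat(k) - x1_hat(k - 1)
```

A genetic algorithm could replace the least-squares step by searching over `a` and `b` (or over the background-value weights) to minimize the fitting error directly.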
A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

Directory of Open Access Journals (Sweden)

Ruzzo Walter L

2006-03-01

Full Text Available Abstract. Background: As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods: In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results: We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion: Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
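The core idea of ranking neighbors by a weighted combination of base similarity measures can be sketched as follows. The weights are taken as given here (in the paper they are inferred by regression), and the confidence score is a plain vote share rather than the authors' voting scheme; all names are illustrative.

```python
from collections import Counter


def combined_similarity(sims, weights):
    """Weighted sum of base similarity scores for one (target, candidate) pair."""
    return sum(w * s for w, s in zip(weights, sims))


def knn_predict(candidates, weights, k=3):
    """candidates: list of (label, [base similarity scores to the target gene]).

    Returns (predicted_label, confidence), where confidence is the share of
    the k most similar neighbours that carry the predicted label.
    """
    ranked = sorted(candidates,
                    key=lambda c: combined_similarity(c[1], weights),
                    reverse=True)[:k]
    votes = Counter(label for label, _ in ranked)
    label, count = votes.most_common(1)[0]
    return label, count / len(ranked)
```

Changing the weights changes which candidates count as neighbors, which is why the choice of similarity metric has such a strong effect on KNN over heterogeneous data.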
On the best learning algorithm for web services response time prediction

DEFF Research Database (Denmark)

Madsen, Henrik; Albu, Razvan-Daniel; Popentiu-Vladicescu, Florin

2013-01-01

In this article we will examine the effect of different learning algorithms, while training the MLP (Multilayer Perceptron) with the intention of predicting web services response time. Web services do not necessitate a user interface. This may seem contradictory to most people's concept of what...... an application is. A Web service is better imagined as an application "segment," or better as a program enabler. Performance is an important quality aspect of Web services because of their distributed nature. Predicting the response of web services during their operation is very important....
An Automated Defect Prediction Framework using Genetic Algorithms: A Validation of Empirical Studies

Directory of Open Access Journals (Sweden)

Juan Murillo-Morera

2016-05-01

Full Text Available Today, it is common for software projects to collect measurement data through development processes. With these data, defect prediction software can try to estimate the defect proneness of a software module, with the objective of assisting and guiding software practitioners. With timely and accurate defect predictions, practitioners can focus their limited testing resources on higher risk areas. This paper reports the results of three empirical studies that use an automated genetic defect prediction framework. This framework generates and compares different learning schemes (preprocessing + attribute selection + learning algorithms) and selects the best one using a genetic algorithm, with the objective of estimating the defect proneness of a software module. The first empirical study is a performance comparison of our framework with the most important framework in the literature. The second empirical study is a performance and runtime comparison between our framework and an exhaustive framework. The third empirical study is a sensitivity analysis. The last empirical study is our main contribution in this paper. The performance of the software development defect prediction models (using AUC, Area Under the Curve) was validated using NASA-MDP and PROMISE data sets. Seventeen data sets from the NASA-MDP (13) and PROMISE (4) projects were analyzed by running an NxM-fold cross-validation. A genetic algorithm was used to select the components of the learning schemes automatically, and to assess and report the results. Our results showed similar performance between frameworks. Our framework had a better runtime than the exhaustive framework. Finally, we report the best configuration according to the sensitivity analysis.
Development of a generally applicable morphokinetic algorithm capable of predicting the implantation potential of embryos transferred on Day 3

Science.gov (United States)

Petersen, Bjørn Molt; Boel, Mikkel; Montag, Markus; Gardner, David K.

2016-01-01

STUDY QUESTION Can a generally applicable morphokinetic algorithm suitable for Day 3 transfers of time-lapse monitored embryos originating from different culture conditions and fertilization methods be developed for the purpose of supporting the embryologist's decision on which embryo to transfer back to the patient in assisted reproduction? SUMMARY ANSWER The algorithm presented here can be used independently of culture conditions and fertilization method and provides predictive power not surpassed by other published algorithms for ranking embryos according to their blastocyst formation potential. WHAT IS KNOWN ALREADY Generally applicable algorithms have so far been developed only for predicting blastocyst formation. A number of clinics have reported validated implantation prediction algorithms, which have been developed based on clinic-specific culture conditions and clinical environment. However, a generally applicable embryo evaluation algorithm based on actual implantation outcome has not yet been reported. STUDY DESIGN, SIZE, DURATION Retrospective evaluation of data extracted from a database of known implantation data (KID) originating from 3275 embryos transferred on Day 3 conducted in 24 clinics between 2009 and 2014. The data represented different culture conditions (reduced and ambient oxygen with various culture medium strategies) and fertilization methods (IVF, ICSI). The capability to predict blastocyst formation was evaluated on an independent set of morphokinetic data from 11 218 embryos which had been cultured to Day 5. PARTICIPANTS/MATERIALS, SETTING, METHODS The algorithm was developed by applying automated recursive partitioning to a large number of annotation types and derived equations, progressing to a five-fold cross-validation test of the complete data set and a validation test of different incubation conditions and fertilization methods. The results were expressed as receiver operating characteristics curves using the area under the
Study on predicting residual life of elevator links by fracture mechanics approach

Energy Technology Data Exchange (ETDEWEB)

Li Helin; Zhang Yi; Deng Zengjie [China National Petroleum Corp., Xi'an, Shaanxi (China). Tubular Goods Research Center]; Jin Dazeng [Xi'an Jiaotong Univ., Xi'an, Shaanxi (China)]

1995-12-31

Based on an investigation and on failure and fracture analysis of elevator links, residual life prediction of links using a fracture mechanics approach is studied, and the mechanical properties, fracture toughness K_IC, and fatigue crack propagation rate da/dN of the steel for elevator links are determined. Using the relation between the stress intensity factor K_I and the strain-energy release rate, the two-dimensional conversion thickness finite element method is used to calculate the stress intensity factors K_I for dangerous sections in the ring part of links. Furthermore, the reliability of the finite element calculations of K_I for dangerous sections of elevator links and of the residual life computation for links is verified by fatigue tests of actual links. Finally, experimental verification of the computed results against a 150T link that fractured on site indicates that the computed critical crack lengths and residual life agree well with the measured values and meet the needs of oil drilling.
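The quantities named in this abstract are linked by standard linear elastic fracture mechanics relations, summarized below as a reminder. The abstract does not state which crack growth law the authors fitted, so the Paris form on the last line is an assumption, and the constants C and m are placeholders rather than values from the paper.

```latex
% Strain-energy release rate and mode-I stress intensity factor
G = \frac{K_I^{2}}{E'}, \qquad
E' = \begin{cases}
  E & \text{(plane stress)} \\
  E / (1 - \nu^{2}) & \text{(plane strain)}
\end{cases}

% Critical crack length a_c is reached when K_I(a_c) = K_{IC}.
% Residual life (cycles) from the current crack length a_0:
N_r = \int_{a_0}^{a_c} \frac{\mathrm{d}a}{\mathrm{d}a/\mathrm{d}N}

% Assumed crack growth law (Paris form), with material constants C and m:
\frac{\mathrm{d}a}{\mathrm{d}N} = C \, (\Delta K)^{m}
```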
MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

Directory of Open Access Journals (Sweden)

Yang Yi-Fan

2007-03-01

Full Text Available Abstract. Background: Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It remains of interest to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results: This paper describes a new prokaryotic gene-finding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure before predicting genes. Conclusion: Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.
A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. 3: A stochastic rain fade control algorithm for satellite link power via non linear Markov filtering theory

Science.gov (United States)

Manning, Robert M.

1991-01-01

The dynamic and composite nature of propagation impairments that are incurred on Earth-space communications links at frequencies in and above 30/20 GHz Ka band, i.e., rain attenuation, cloud and/or clear air scintillation, etc., combined with the need to counter such degradations after the small link margins have been exceeded, necessitate the use of dynamic statistical identification and prediction processing of the fading signal in order to optimally estimate and predict the levels of each of the deleterious attenuation components. Such requirements are being met in NASA's Advanced Communications Technology Satellite (ACTS) Project by the implementation of optimal processing schemes derived through the use of the Rain Attenuation Prediction Model and nonlinear Markov filtering theory.
BPP: a sequence-based algorithm for branch point prediction.

Science.gov (United States)

Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing

2017-10-15

Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. Contact: djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online.
XTALOPT: An open-source evolutionary algorithm for crystal structure prediction

Science.gov (United States)

Lonie, David C.; Zurek, Eva

2011-02-01

The implementation and testing of XTALOPT, an evolutionary algorithm for crystal structure prediction, is outlined. We present our new periodic displacement (ripple) operator which is ideally suited to extended systems. It is demonstrated that hybrid operators, which combine two pure operators, reduce the number of duplicate structures in the search. This allows for better exploration of the potential energy surface of the system in question, while simultaneously zooming in on the most promising regions. A continuous workflow, which makes better use of computational resources as compared to traditional generation based algorithms, is employed. Various parameters in XTALOPT are optimized using a novel benchmarking scheme. XTALOPT is available under the GNU Public License, has been interfaced with various codes commonly used to study extended systems, and has an easy to use, intuitive graphical interface. Program summary. Program title: XTALOPT. Catalogue identifier: AEGX_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGX_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GPL v2.1 or later [1]. No. of lines in distributed program, including test data, etc.: 36 849. No. of bytes in distributed program, including test data, etc.: 1 149 399. Distribution format: tar.gz. Programming language: C++. Computer: PCs, workstations, or clusters. Operating system: Linux. Classification: 7.7. External routines: QT [2], OpenBabel [3], AVOGADRO [4], SPGLIB [8] and one of: VASP [5], PWSCF [6], GULP [7]. Nature of problem: Predicting the crystal structure of a system from its stoichiometry alone remains a grand challenge in computational materials science, chemistry, and physics. Solution method: Evolutionary algorithms are stochastic search techniques which use concepts from biological evolution in order to locate the global minimum on their potential energy surface. Our evolutionary algorithm, XTALOPT, is freely
Prediction of non-canonical polyadenylation signals in human genomic sequences based on a novel algorithm using a fuzzy membership function.

Science.gov (United States)

Kamasawa, Masami; Horiuchi, Jun-Ichi

2009-05-01

Computational prediction of polyadenylation signals (PASes) is essential for the analysis of alternative polyadenylation, which plays crucial roles in gene regulation by generating heterogeneity in the 3'-UTRs of mRNAs. To date, several algorithms, mostly based on machine learning methods, have been developed to predict PASes. The accuracy of predictions by those algorithms has improved significantly over the last decade. However, they are designed primarily for prediction of the canonical AAUAAA and its common variant AUUAAA, whereas other variants have been ignored despite recent studies indicating that non-canonical variants of AAUAAA are more important in the polyadenylation process than commonly recognized. Here we present a new algorithm, "PolyF", that employs fuzzy logic to advance computational PAS prediction: it enables prediction of the non-canonical variants and improves the accuracy of canonical A(A/U)UAAA prediction. PolyF is a simple computational algorithm composed of membership functions defining sequence features of the downstream sequence element (DSE) and upstream sequence element (USE), together with an inference engine. PolyF successfully identified the 10 single-nucleotide variants with approximately the same or higher accuracies compared to those for A(A/U)UAAA. PolyF also achieved higher accuracies for A(A/U)UAAA prediction than commonly known PAS finder programs, Polyadq and Erpin. Incorporating the USE into the PolyF algorithm was found to enhance prediction accuracies for all 12 PAS hexamers compared to using only the DSE, suggesting an important contribution of the USE in the polyadenylation process.
Hazard Forecasting by MRI: A Prediction Algorithm of the First Kind

Science.gov (United States)

Lomnitz, C.

2003-12-01

Seismic gaps do not tell us when and where the next earthquake is due. We present new results on limited earthquake hazard prediction at plate boundaries. Our algorithm quantifies earthquake hazard in seismic gaps. The prediction window found for M7 is on the order of 50 km by 20 years (Lomnitz, 1996a). The earth is unstable with respect to small perturbations of the initial conditions. A prediction of the first kind is an estimate of the time evolution of a complex system with fixed boundary conditions in response to changes in the initial state, for example, weather prediction (Edward Lorenz, 1975; Hasselmann, 2002). We use the catalog of large world earthquakes as a proxy for the initial conditions. The MRI algorithm simulates the response of the system to updating the catalog. After a local stress transient dP the entropy decays as (grad dP)^2 due to transient flows directed toward the epicenter. Healing is the thermodynamic process which resets the state of stress. It proceeds as a power law from the rupture boundary inwards, as in a wound. The half-life of a rupture is defined as the healing time which shrinks the size of a scar by half. Healed segments of plate boundary can rupture again. From observations in Chile, Mexico and Japan we find that the half-life of a seismic rupture is about 20 years, in agreement with seismic gap observations. The moment ratio MR is defined as the contrast between the cumulative regional moment release and the local moment deficiency at time t along the plate boundary. The procedure is called MRI. The findings: (1) MRI works; (2) major earthquakes match prominent peaks in the MRI graph; (3) important events (Central Chile 1985; Mexico 1985; Kobe 1995) match MRI peaks which began to emerge 10 to 20 years before the earthquake; (4) the emergence of peaks in MRI depends on earlier ruptures that occurred, not adjacent to but at 10 to 20 fault lengths from the epicentral region, in agreement with triggering effects. The hazard
Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

Science.gov (United States)

Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.

2017-02-01

We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ~60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
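The true skill statistic used as the prediction score here is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the function and argument names are illustrative, not taken from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = hit rate minus false alarm rate; ranges from -1 to 1.

    tp/fn: flaring cases predicted correctly / missed.
    fp/tn: non-flaring cases predicted as flares / predicted correctly.
    """
    return tp / (tp + fn) - fp / (fp + tn)
```

Unlike plain accuracy, the TSS does not change when the ratio of flaring to non-flaring samples changes, which matters because strong flares are rare events.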

Slow Learner Prediction Using Multi-Variate Naïve Bayes Classification Algorithm

Directory of Open Access Journals (Sweden)

Shiwani Rana

2017-01-01

Full Text Available Machine learning is a field of computer science that studies algorithms that learn from data and how they are constructed. In machine learning, algorithms make predictions for specific inputs. Classification is a supervised learning approach that maps a data item into predefined classes. For predicting slow learners in an institute, a modified Naïve Bayes algorithm is implemented. The implementation is carried out using Python. It takes into account a combination of similar multi-valued attributes. A dataset of 60 students of BE (Information Technology), Third Semester, for the subject of Digital Electronics at the University Institute of Engineering and Technology (UIET), Panjab University (PU), Chandigarh, India is used to carry out the simulations. The analysis is done by choosing the forty-eight most significant attributes. The experimental results show that the modified Naïve Bayes model outperforms the Naïve Bayes classifier in accuracy but requires significant improvement in terms of elapsed time. With the modified Naïve Bayes approach, the accuracy is 71.66%, whereas it is 66.66% with the existing Naïve Bayes model. Further, a comparison is drawn using the WEKA tool, where Naïve Bayes obtains an accuracy of 58.33%.
Exploring the significance of human mobility patterns in social link prediction

KAUST Repository

Alharbi, Basma Mohammed; Zhang, Xiangliang

2014-01-01

Link prediction is a fundamental task in social networks. Recently, emphasis has been placed on forecasting new social ties using user mobility patterns, e.g., investigating physical and semantic co-locations for new proximity measure. This paper
Paroxysmal atrial fibrillation prediction based on HRV analysis and non-dominated sorting genetic algorithm III.

Science.gov (United States)

Boon, K H; Khalil-Hani, M; Malarvili, M B

2018-01-01

This paper presents a method that is able to predict paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals than existing methods and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of electrically stabilizing the heart and preventing the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, which consists of the stages of pre-processing, HRV feature extraction, and a support vector machine (SVM) model. The pre-processing stage comprises heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, and non-linear HRV features are extracted from the pre-processed data in the feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most previous works. This accuracy rate is achieved even with the HRV signal length reduced from the typical 30 min to just 5 min (a reduction of 83%). Another significant result is that the sensitivity rate, which is considered more important than other performance metrics in this paper, can be improved at the cost of lower specificity.
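The trade-off between sensitivity and specificity mentioned at the end of this abstract can be seen by moving the decision threshold of any scoring classifier. The sketch below is generic and is not the authors' optimization procedure; the scores, labels, and function name are made up for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """labels: 1 = PAF event, 0 = no event.

    A case is predicted positive when its score is >= threshold.
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)
```

Lowering the threshold catches more true events (higher sensitivity) but also flags more non-events (lower specificity), which is the exchange the abstract describes.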
Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

Science.gov (United States)

Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

2017-07-01

Variants in neuronal voltage-gated sodium channel α-subunits genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when missense variants are identified but heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand the results of indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetic Criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a
Genomic risk prediction of aromatase inhibitor-related arthralgia in patients with breast cancer using a novel machine-learning algorithm.

Science.gov (United States)

Reinbolt, Raquel E; Sonis, Stephen; Timmers, Cynthia D; Fernández-Martínez, Juan Luis; Cernea, Ana; de Andrés-Galiana, Enrique J; Hashemi, Sepehr; Miller, Karin; Pilarski, Robert; Lustberg, Maryam B

2018-01-01

Many breast cancer (BC) patients treated with aromatase inhibitors (AIs) develop aromatase inhibitor-related arthralgia (AIA). Candidate gene studies to identify AIA risk are limited in scope. We evaluated the potential of a novel analytic algorithm (NAA) to predict AIA using germline single nucleotide polymorphisms (SNP) data obtained before treatment initiation. Systematic chart review of 700 AI-treated patients with stage I-III BC identified asymptomatic patients (n = 39) and those with clinically significant AIA resulting in AI termination or therapy switch (n = 123). Germline DNA was obtained and SNP genotyping performed using the Affymetrix UK BioBank Axiom Array to yield 695,277 SNPs. SNP clusters that most closely defined AIA risk were discovered using an NAA that sequentially combined statistical filtering and a machine-learning algorithm. NCBI PhenGenI and Ensemble databases defined gene attribution of the most discriminating SNPs. Phenotype, pathway, and ontologic analyses assessed functional and mechanistic validity. Demographics were similar in cases and controls. A cluster of 70 SNPs, correlating to 57 genes, was identified. This SNP group predicted AIA occurrence with a maximum accuracy of 75.93%. Strong associations with arthralgia, breast cancer, and estrogen phenotypes were seen in 19/57 genes (33%) and were functionally consistent. Using a NAA, we identified a 70 SNP cluster that predicted AIA risk with fair accuracy. Phenotype, functional, and pathway analysis of attributed genes was consistent with clinical phenotypes. This study is the first to link a specific SNP/gene cluster to AIA risk independent of candidate gene bias. © 2017 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Predicting the onset of hazardous alcohol drinking in primary care: development and validation of a simple risk algorithm.

Science.gov (United States)

Bellón, Juan Ángel; de Dios Luna, Juan; King, Michael; Nazareth, Irwin; Motrico, Emma; GildeGómez-Barragán, María Josefa; Torres-González, Francisco; Montón-Franco, Carmen; Sánchez-Celaya, Marta; Díaz-Barreiros, Miguel Ángel; Vicens, Catalina; Moreno-Peral, Patricia

2017-04-01

Little is known about the risk of progressing to hazardous alcohol use in abstinent or low-risk drinkers. To develop and validate a simple brief risk algorithm for the onset of hazardous alcohol drinking (HAD) over 12 months for use in primary care. Prospective cohort study in 32 health centres from six Spanish provinces, with evaluations at baseline, 6 months, and 12 months. Forty-one risk factors were measured and multilevel logistic regression and inverse probability weighting were used to build the risk algorithm. The outcome was new occurrence of HAD during the study, as measured by the AUDIT. From the lists of 174 GPs, 3954 adult abstinent or low-risk drinkers were recruited. The 'predictAL-10' risk algorithm included just nine variables (10 questions): province, sex, age, cigarette consumption, perception of financial strain, having ever received treatment for an alcohol problem, childhood sexual abuse, AUDIT-C, and interaction AUDIT-C*Age. The c-index was 0.886 (95% CI = 0.854 to 0.918). The optimal cutoff had a sensitivity of 0.83 and specificity of 0.80. Excluding childhood sexual abuse from the model (the 'predictAL-9'), the c-index was 0.880 (95% CI = 0.847 to 0.913), sensitivity 0.79, and specificity 0.81. There was no statistically significant difference between the c-indexes of predictAL-10 and predictAL-9. The predictAL-10/9 is a simple and internally valid risk algorithm to predict the onset of hazardous alcohol drinking over 12 months in primary care attendees; it is a brief tool that is potentially useful for primary prevention of hazardous alcohol drinking. © British Journal of General Practice 2017.
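The c-index, sensitivity, and specificity reported above can be illustrated with a short sketch. It assumes a binary outcome, for which the c-index is the share of case/non-case pairs ranked correctly by predicted risk; the data, the cutoff, and the function names are hypothetical and are not taken from the predictAL study.

```python
from itertools import product
from typing import Sequence, Tuple


def c_index(risk: Sequence[float], outcome: Sequence[int]) -> float:
    """Share of (case, non-case) pairs where the case has the higher risk; ties count half."""
    cases = [r for r, y in zip(risk, outcome) if y == 1]
    controls = [r for r, y in zip(risk, outcome) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    score = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c, n in product(cases, controls)
    )
    return score / (len(cases) * len(controls))


def sensitivity_specificity(
    risk: Sequence[float], outcome: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Classify risk >= cutoff as positive and return (sensitivity, specificity)."""
    tp = sum(1 for r, y in zip(risk, outcome) if y == 1 and r >= cutoff)
    fn = sum(1 for r, y in zip(risk, outcome) if y == 1 and r < cutoff)
    tn = sum(1 for r, y in zip(risk, outcome) if y == 0 and r < cutoff)
    fp = sum(1 for r, y in zip(risk, outcome) if y == 0 and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted risks and observed outcomes (1 = onset of HAD).
risk = [0.9, 0.8, 0.35, 0.3, 0.2, 0.1]
outcome = [1, 1, 0, 1, 0, 0]
c = c_index(risk, outcome)
sens, spec = sensitivity_specificity(risk, outcome, cutoff=0.25)
```

Lowering the cutoff raises sensitivity at the cost of specificity, which is why a single "optimal cutoff" is reported alongside the threshold-free c-index.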
TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

Science.gov (United States)

Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

2018-04-11

Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions
A new algorithm predicts pressure and temperature profiles of gas/gas-condensate transmission pipelines

Energy Technology Data Exchange (ETDEWEB)

Mokhatab, Saied [OIEC - Oil Industries' Engineering and Construction Group, Tehran (Iran, Islamic Republic of); Vatani, Ali [University of Tehran (Iran, Islamic Republic of)

2003-07-01

The main objective of the present study has been the development of a relatively simple analytical algorithm for predicting flow temperature and pressure profiles along the two-phase, gas/gas-condensate transmission pipelines. Results demonstrate the ability of the method to predict reasonably accurate pressure gradient and temperature gradient profiles under operating conditions. (author)
A community effort to assess and improve drug sensitivity prediction algorithms.

Science.gov (United States)

Costello, James C; Heiser, Laura M; Georgii, Elisabeth; Gönen, Mehmet; Menden, Michael P; Wang, Nicholas J; Bansal, Mukesh; Ammad-ud-din, Muhammad; Hintsanen, Petteri; Khan, Suleiman A; Mpindi, John-Patrick; Kallioniemi, Olli; Honkela, Antti; Aittokallio, Tero; Wennerberg, Krister; Collins, James J; Gallahan, Dan; Singer, Dinah; Saez-Rodriguez, Julio; Kaski, Samuel; Gray, Joe W; Stolovitzky, Gustavo

2014-12-01

Predicting the best treatment strategy from genomic information is a core goal of precision medicine. Here we focus on predicting drug response based on a cohort of genomic, epigenomic and proteomic profiling data sets measured in human breast cancer cell lines. Through a collaborative effort between the National Cancer Institute (NCI) and the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we analyzed a total of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biological pathway information. We found that gene expression microarrays consistently provided the best predictive power of the individual profiling data sets; however, performance was increased by including multiple, independent data sets. We discuss the innovations underlying the top-performing methodology, Bayesian multitask MKL, and we provide detailed descriptions of all methods. This study establishes benchmarks for drug sensitivity prediction and identifies approaches that can be leveraged for the development of new methods.
The Hidden Flow Structure and Metric Space of Network Embedding Algorithms Based on Random Walks.

Science.gov (United States)

Gu, Weiwei; Gong, Li; Lou, Xiaodan; Zhang, Jiang

2017-10-13

Network embedding, which encodes all vertices in a network as a set of numerical vectors in accordance with its local and global structures, has drawn widespread attention. Network embedding not only learns significant features of a network, such as the clustering and linking prediction, but also learns the latent vector representation of the nodes which provides theoretical support for a variety of applications, such as visualization, link prediction, node classification, and recommendation. As the latest progress of the research, several algorithms based on random walks have been devised. Although those algorithms have drawn much attention for their high scores in learning efficiency and accuracy, there is still a lack of theoretical explanation, and the transparency of those algorithms has been doubted. Here, we propose an approach based on the open-flow network model to reveal the underlying flow structure and its hidden metric space of different random walk strategies on networks. We show that the essence of embedding based on random walks is the latent metric structure defined on the open-flow network. This not only deepens our understanding of random-walk-based embedding algorithms but also helps in finding new potential applications in network embedding.
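To make "embedding based on random walks" concrete, the sketch below generates the walk corpus that such algorithms typically consume. It assumes uniform, unbiased walks on an unweighted graph (the simplest strategy, as in DeepWalk); it does not implement the open-flow network analysis of the paper, and the toy graph and function names are hypothetical.

```python
import random
from typing import Dict, List

Graph = Dict[str, List[str]]  # adjacency list: node -> neighbours


def random_walk(graph: Graph, start: str, length: int, rng: random.Random) -> List[str]:
    """Walk up to `length` nodes, choosing the next node uniformly among neighbours."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:  # dead end: stop early
            break
        walk.append(rng.choice(neighbours))
    return walk


def generate_walks(graph: Graph, walks_per_node: int, length: int, seed: int = 0) -> List[List[str]]:
    """Start `walks_per_node` walks from every node; the result is the training corpus."""
    rng = random.Random(seed)
    return [
        random_walk(graph, node, length, rng)
        for _ in range(walks_per_node)
        for node in graph
    ]


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = generate_walks(toy, walks_per_node=2, length=5)
```

Each walk is then treated like a sentence, and node vectors are learned so that nodes co-occurring within walks end up close together; the flows between nodes along these walks are the quantity the abstract's open-flow model is concerned with.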
Crystal structure prediction of flexible molecules using parallel genetic algorithms with a standard force field.

Science.gov (United States)

Kim, Seonah; Orendt, Anita M; Ferraro, Marta B; Facelli, Julio C

2009-10-01

This article describes the application of our distributed computing framework for crystal structure prediction (CSP), the modified genetic algorithms for crystal and cluster prediction (MGAC), to predict the crystal structure of flexible molecules using the general Amber force field (GAFF) and the CHARMM program. The MGAC distributed computing framework includes a series of tightly integrated computer programs for generating the molecule's force field, sampling crystal structures using a distributed parallel genetic algorithm, and local energy minimization of the structures, followed by the classifying, sorting, and archiving of the most relevant structures. Our results indicate that the method can consistently find the experimentally known crystal structures of flexible molecules, but the number of missing structures and poor ranking observed in some crystals show the need for further improvement of the potential. Copyright 2009 Wiley Periodicals, Inc.
A parallel adaptive mesh refinement algorithm for predicting turbulent non-premixed combusting flows

International Nuclear Information System (INIS)

Gao, X.; Groth, C.P.T.

2005-01-01

A parallel adaptive mesh refinement (AMR) algorithm is proposed for predicting turbulent non-premixed combusting flows characteristic of gas turbine engine combustors. The Favre-averaged Navier-Stokes equations governing mixture and species transport for a reactive mixture of thermally perfect gases in two dimensions, the two transport equations of the κ-ψ turbulence model, and the time-averaged species transport equations, are all solved using a fully coupled finite-volume formulation. A flexible block-based hierarchical data structure is used to maintain the connectivity of the solution blocks in the multi-block mesh and facilitate automatic solution-directed mesh adaptation according to physics-based refinement criteria. This AMR approach allows for anisotropic mesh refinement and the block-based data structure readily permits efficient and scalable implementations of the algorithm on multi-processor architectures. Numerical results for turbulent non-premixed diffusion flames, including cold- and hot-flow predictions for a bluff body burner, are described and compared to available experimental data. The numerical results demonstrate the validity and potential of the parallel AMR approach for predicting complex non-premixed turbulent combusting flows. (author)
Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

Science.gov (United States)

Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

2018-01-01

The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernel was investigated for the first time. A partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content, which ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results, with a coefficient of determination of prediction (R2P) of 0.901, a root mean square error of prediction (RMSEP) of 0.108 and a residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is feasible and promising for determination of the protein content in peanut kernels.
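The three figures of merit quoted above (R2P, RMSEP, RPD) can be computed from reference and predicted values as sketched below. The definitions used here are the common ones (RPD taken as the sample standard deviation of the reference values divided by RMSEP) and are an assumption, since the abstract does not state its formulas; the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def prediction_metrics(reference: Sequence[float], predicted: Sequence[float]) -> Tuple[float, float, float]:
    """Return (R2P, RMSEP, RPD) for a prediction set."""
    n = len(reference)
    if n < 2 or n != len(predicted):
        raise ValueError("need two equally long sequences with at least 2 samples")
    mean_ref = sum(reference) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))  # residual sum of squares
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)              # total sum of squares
    r2p = 1.0 - ss_res / ss_tot
    rmsep = math.sqrt(ss_res / n)
    sd_ref = math.sqrt(ss_tot / (n - 1))  # sample standard deviation of the reference values
    rpd = math.inf if rmsep == 0 else sd_ref / rmsep
    return r2p, rmsep, rpd


# Hypothetical protein contents (%) for four kernels: measured vs. predicted.
reference = [23.5, 25.0, 26.5, 28.0]
predicted = [23.7, 24.9, 26.4, 28.2]
r2p, rmsep, rpd = prediction_metrics(reference, predicted)
```

Because RPD scales the error by the spread of the reference data, it allows models calibrated on different concentration ranges to be compared more fairly than RMSEP alone.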
Predictive algorithms for early detection of retinopathy of prematurity.

Science.gov (United States)

Piermarocchi, Stefano; Bini, Silvia; Martini, Ferdinando; Berton, Marianna; Lavini, Anna; Gusson, Elena; Marchini, Giorgio; Padovani, Ezio Maria; Macor, Sara; Pignatto, Silvia; Lanzetta, Paolo; Cattarossi, Luigi; Baraldi, Eugenio; Lago, Paola

2017-03-01

To evaluate sensitivity, specificity and the safest cut-offs of three predictive algorithms (WINROP, ROPScore and CHOP ROP) for retinopathy of prematurity (ROP). A retrospective study was conducted in three centres from 2012 to 2014; 445 preterms with gestational age (GA) ≤ 30 weeks and/or birthweight (BW) ≤ 1500 g, and additional unstable cases, were included. No-ROP, mild and type 1 ROP were categorized. The algorithms were analysed for infants with all parameters (GA, BW, weight gain, oxygen therapy, blood transfusion) needed for calculation (399 babies). Retinopathy of prematurity (ROP) was identified in both eyes in 116 patients (26.1%), and 44 (9.9%) had type 1 ROP. Gestational age and BW were significantly lower in ROP group compared with no-ROP subjects (GA: 26.7 ± 2.2 and 30.2 ± 1.9, respectively, p < 0.0001; BW: 839.8 ± 287.0 and 1288.1 ± 321.5 g, respectively, p = 0.0016). Customized alarms of ROPScore and CHOP ROP correctly identified all infants having any ROP or type 1 ROP. WINROP missed 19 cases of ROP, including three type 1 ROP. ROPScore and CHOP ROP provided the best performances with an area under the receiver operating characteristic curve for the detection of severe ROP of 0.93 (95% CI, 0.90-0.96, and 95% CI, 0.89-0.96, respectively), and WINROP obtained 0.83 (95% CI, 0.77-0.87). Median time from alarm to treatment was 11.1, 5.1 and 9.1 weeks, for WINROP, ROPScore and CHOP ROP, respectively. ROPScore and CHOP ROP showed 100% sensitivity to identify sight-threatening ROP. Predictive algorithms are a reliable tool for early identification of infants requiring referral to an ophthalmologist, for reorganizing resources and reducing stressful procedures to preterm babies. © 2016 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

Science.gov (United States)

Kenah, Eben; Britton, Tom; Halloran, M. Elizabeth; Longini, Ira M.

2016-01-01

Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. PMID:27070316
A causal link between prediction errors, dopamine neurons and learning.

Science.gov (United States)

Steinberg, Elizabeth E; Keiflin, Ronald; Boivin, Josiah R; Witten, Ilana B; Deisseroth, Karl; Janak, Patricia H

2013-07-01

Situations in which rewards are unexpectedly obtained or withheld represent opportunities for new learning. Often, this learning includes identifying cues that predict reward availability. Unexpected rewards strongly activate midbrain dopamine neurons. This phasic signal is proposed to support learning about antecedent cues by signaling discrepancies between actual and expected outcomes, termed a reward prediction error. However, it is unknown whether dopamine neuron prediction error signaling and cue-reward learning are causally linked. To test this hypothesis, we manipulated dopamine neuron activity in rats in two behavioral procedures, associative blocking and extinction, that illustrate the essential function of prediction errors in learning. We observed that optogenetic activation of dopamine neurons concurrent with reward delivery, mimicking a prediction error, was sufficient to cause long-lasting increases in cue-elicited reward-seeking behavior. Our findings establish a causal role for temporally precise dopamine neuron signaling in cue-reward learning, bridging a critical gap between experimental evidence and influential theoretical frameworks.
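The role of prediction errors in associative blocking, as described above, can be illustrated with the Rescorla-Wagner learning rule, a standard textbook model of error-driven learning. This is a sketch under that modelling assumption, not the authors' analysis; the cue names, learning rate, and trial counts are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Trial = Tuple[Tuple[str, ...], float]  # (cues present, reward received)


def rescorla_wagner(trials: Iterable[Trial], alpha: float = 0.2) -> Dict[str, float]:
    """Update cue values by a shared prediction error: delta = reward - summed expectation."""
    values: Dict[str, float] = {}
    for cues, reward in trials:
        expected = sum(values.get(c, 0.0) for c in cues)
        delta = reward - expected  # reward prediction error
        for c in cues:
            values[c] = values.get(c, 0.0) + alpha * delta
    return values


# Blocking design: cue A is trained alone first, then A and B are presented together.
pretraining = [(("A",), 1.0)] * 50
compound = [(("A", "B"), 1.0)] * 50

blocked = rescorla_wagner(pretraining + compound)  # B is added to an already-predictive A
control = rescorla_wagner(compound)                # A and B are both novel
```

In the blocked condition the reward is already predicted by A, so the error during compound trials is close to zero and B acquires almost no value, whereas in the control condition A and B share the value equally. Artificially supplying an error signal at reward delivery, as in the optogenetic manipulation described above, is what this model predicts would restore learning about B.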
Training algorithms evaluation for artificial neural network to temporal prediction of photovoltaic generation

International Nuclear Information System (INIS)

Arantes Monteiro, Raul Vitor; Caixeta Guimarães, Geraldo; Rocio Castillo, Madeleine; Matheus Moura, Fabrício Augusto; Tamashiro, Márcio Augusto

2016-01-01

Current energy policies are encouraging the connection of power generation based on low-polluting technologies, mainly those using renewable sources, to distribution networks. Hence, it becomes increasingly important to understand the technical challenges of high penetration of PV systems in the grid, especially considering the effects of the intermittence of this source on the power quality, reliability and stability of the electric distribution system. This fact can affect the distribution networks to which they are attached, causing overvoltage, undervoltage and frequency oscillations. In order to predict these disturbances, artificial neural networks are used. This article aims to analyze 3 training algorithms used in artificial neural networks for temporal prediction of the active power generated by photovoltaic panels. As a result, it was concluded that the algorithm with the best performance among the 3 analyzed was the Levenberg-Marquardt.
A New Tool for CME Arrival Time Prediction using Machine Learning Algorithms: CAT-PUMA

Science.gov (United States)

Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert

2018-03-01

Coronal mass ejections (CMEs) are arguably the most violent eruptions in the solar system. CMEs can cause severe disturbances in interplanetary space and can even affect human activities in many aspects, causing damage to infrastructure and loss of revenue. Fast and accurate prediction of CME arrival time is vital to minimize the disruption that CMEs may cause when interacting with geospace. In this paper, we propose a new approach for partial-/full halo CME Arrival Time Prediction Using Machine learning Algorithms (CAT-PUMA). Via detailed analysis of the CME features and solar-wind parameters, we build a prediction engine taking advantage of 182 previously observed geo-effective partial-/full halo CMEs and using algorithms of the Support Vector Machine. We demonstrate that CAT-PUMA is accurate and fast. In particular, predictions made after applying CAT-PUMA to a test set unknown to the engine show a mean absolute prediction error of ∼5.9 hr within the CME arrival time, with 54% of the predictions having absolute errors less than 5.9 hr. Comparisons with other models reveal that CAT-PUMA has a more accurate prediction for 77% of the events investigated that can be carried out very quickly, i.e., within minutes of providing the necessary input parameters of a CME. A practical guide containing the CAT-PUMA engine and the source code of two examples are available in the Appendix, allowing the community to perform their own applications for prediction using CAT-PUMA.

Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend

Science.gov (United States)

Boonjing, Veera; Intakosum, Sarun

2016-01-01

This study investigated the use of Artificial Neural Network (ANN) and Genetic Algorithm (GA) for prediction of Thailand's SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid's prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span. PMID:27974883
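The input construction described above, one indicator evaluated over 3-, 5-, 10-, and 15-day spans before the prediction day, can be sketched as follows. A simple moving average stands in for the unspecified technical indicators, so the indicator choice, the price series, and the function name are assumptions for illustration only.

```python
from typing import Dict, Sequence

SPANS = (3, 5, 10, 15)  # past time spans, in trading days, named in the abstract


def span_features(closes: Sequence[float], day: int, spans: Sequence[int] = SPANS) -> Dict[str, float]:
    """Simple moving average of the closes in each span ending just before `day`."""
    if day < max(spans) or day > len(closes):
        raise ValueError("not enough history before the prediction day")
    return {f"sma_{s}": sum(closes[day - s:day]) / s for s in spans}


# Hypothetical closing prices 1.0, 2.0, ..., 20.0; predict for the day after the series ends.
closes = [float(i) for i in range(1, 21)]
features = span_features(closes, day=20)

# Number of non-empty subsets a feature-selection search could choose from.
n_subsets = 2 ** len(features) - 1
```

With four variables per indicator, the number of candidate subsets doubles with every variable added, which is the exponential growth that motivates using a genetic algorithm rather than exhaustive search for feature selection.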
Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend

Directory of Open Access Journals (Sweden)

Montri Inthachot

2016-01-01

Full Text Available This study investigated the use of Artificial Neural Network (ANN) and Genetic Algorithm (GA) for prediction of Thailand’s SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid’s prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span.
Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

Science.gov (United States)

Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

2018-01-01

T-cell epitope structure identification is a significant challenging immunoinformatic problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. The aim of this process is presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. Also, we achieved an accuracy of 95
Small hydropower spot prediction using SWAT and a diversion algorithm, case study: Upper Citarum Basin

Science.gov (United States)

Kardhana, Hadi; Arya, Doni Khaira; Hadihardaja, Iwan K.; Widyaningtyas, Riawan, Edi; Lubis, Atika

2017-11-01

Small-scale hydropower (SHP) has been an important source of electric energy in Indonesia. Indonesia is a vast country consisting of more than 17,000 islands. It has large fresh water resources, with about 3 m of rainfall and 2 m of runoff. Much of its topography is mountainous and remote but abundant with potential energy. Millions of people do not have sufficient access to electricity, and some live in remote places. Recently, SHP development has been encouraged to supply energy to these places. The development of global hydrology data provides an opportunity to predict the distribution of hydropower potential. In this paper, we demonstrate a run-of-river type SHP spot prediction tool using SWAT and a river diversion algorithm. The Soil and Water Assessment Tool (SWAT), with a 10-year period of CFSR (Climate Forecast System Re-analysis) data as input, was used to predict the spatially distributed flow cumulative distribution function (CDF). A simple algorithm was applied to maximize the potential head of a location by a river diversion representing the head race and penstock. Firm flow and power of the SHP were estimated from the CDF and the algorithm. The tool was applied to the Upper Citarum River Basin, and three out of four existing hydropower locations were well predicted. The result implies that this tool is able to support acceleration of SHP development at an early phase.
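The step "firm flow and power were estimated from the CDF" can be sketched with the standard hydropower relation P = ρ g Q H η. The 90% exceedance level for firm flow, the efficiency of 0.8, the flow series, and the head are all assumptions chosen for illustration; the abstract does not state the values the authors used.

```python
import math
from typing import Sequence

RHO = 1000.0  # water density, kg/m^3
G = 9.81      # gravitational acceleration, m/s^2


def firm_flow(flows: Sequence[float], exceedance: float = 0.9) -> float:
    """Flow equalled or exceeded for the given fraction of the record (flow duration curve)."""
    if not flows or not 0.0 < exceedance <= 1.0:
        raise ValueError("need a non-empty record and 0 < exceedance <= 1")
    ordered = sorted(flows, reverse=True)
    # The k-th largest flow is equalled or exceeded k/n of the time.
    k = max(1, math.ceil(exceedance * len(ordered) - 1e-9))
    return ordered[k - 1]


def power_kw(flow_m3s: float, head_m: float, efficiency: float = 0.8) -> float:
    """Hydraulic power P = rho * g * Q * H * eta, converted from W to kW."""
    return RHO * G * flow_m3s * head_m * efficiency / 1000.0


# Hypothetical simulated flows (m^3/s) at one river reach and a 50 m diverted head.
flows = [12.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0]
q_firm = firm_flow(flows)
p_firm = power_kw(q_firm, head_m=50.0)
```

Since power is proportional to both flow and head, a diversion that gains head between intake and powerhouse raises the estimate directly, which is the quantity the river diversion algorithm seeks to maximize.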
Condition Monitoring for DC-link Capacitors Based on Artificial Neural Network Algorithm

DEFF Research Database (Denmark)

Soliman, Hammam Abdelaal Hammam; Wang, Huai; Gadalla, Brwene Salah Abdelkarim

2015-01-01

In power electronic systems, the capacitor is one of the reliability-critical components. Recently, condition monitoring of capacitors to estimate their health status has attracted academic research. Industry applications require more reliable power electronics products with preventive maintenance. However, the existing capacitor condition monitoring methods suffer from either increased hardware cost or low estimation accuracy, which is a barrier to their adoption in industry applications. New development in condition monitoring technology with software solutions without extra hardware will reduce the cost, and therefore could be more promising for industry applications. A condition monitoring method based on Artificial Neural Network (ANN) algorithm is therefore proposed in this paper. The implementation of the ANN to the DC-link capacitor condition monitoring in a back...
Shape: automatic conformation prediction of carbohydrates using a genetic algorithm

Directory of Open Access Journals (Sweden)

Rosen Jimmy

2009-09-01

Full Text Available Background: Detailed experimental three dimensional structures of carbohydrates are often difficult to acquire. Molecular modelling and computational conformation prediction are therefore commonly used tools for three dimensional structure studies. Modelling procedures generally require significant training and computing resources, which is often impractical for most experimental chemists and biologists. Shape has been developed to improve the availability of modelling in this field. Results: The Shape software package has been developed for simplicity of use and conformation prediction performance. A trivial user interface coupled to an efficient genetic algorithm conformation search makes it a powerful tool for automated modelling. Carbohydrates up to a few hundred atoms in size can be investigated on common computer hardware. It has been shown to perform well for the prediction of over four hundred bioactive oligosaccharides, as well as compare favourably with previously published studies on carbohydrate conformation prediction. Conclusion: The Shape fully automated conformation prediction can be used by scientists who lack significant modelling training, and performs well on computing hardware such as laptops and desktops. It can also be deployed on computer clusters for increased capacity. The prediction accuracy under the default settings is good, as it agrees well with experimental data and previously published conformation prediction studies. This software is available both as open source and under commercial licenses.
Computational investigation of kinetics of cross-linking reactions in proteins: importance in structure prediction.

Science.gov (United States)

Bandyopadhyay, Pradipta; Kuntz, Irwin D

2009-01-01

The determination of protein structure using distance constraints is a new and promising field of study. One implementation involves attaching residues of a protein using a cross-linking agent, followed by protease digestion, analysis of the resulting peptides by mass spectroscopy, and finally sequence threading to detect the protein folds. In the present work, we carry out computational modeling of the kinetics of cross-linking reactions in proteins using the master equation approach. The rate constants of the cross-linking reactions are estimated using the pKas and the solvent-accessible surface areas of the residues involved. This model is tested with fibroblast growth factor (FGF) and cytochrome C. It is consistent with the initial experimental rate data for individual lysine residues for cytochrome C. Our model captures all observed cross-links for FGF and almost 90% of the observed cross-links for cytochrome C, although it also predicts cross-links that were not observed experimentally (false positives). However, the analysis of the false positive results is complicated by the fact that experimental detection of cross-links can be difficult and may depend on specific experimental conditions such as pH, ionic strength. Receiver operator characteristic plots showed that our model does a good job in predicting the observed cross-links. Molecular dynamics simulations showed that for cytochrome C, in general, the two lysines come closer for the observed cross-links as compared to the false positive ones. For FGF, no such clear pattern exists. The kinetic model and MD simulation can be used to study proposed cross-linking protocols.
Decoherence in optimized quantum random-walk search algorithm

International Nuclear Information System (INIS)

Zhang Yu-Chao; Bao Wan-Su; Wang Xiang; Fu Xiang-Qun

2015-01-01

This paper investigates the effects of decoherence generated by broken-link-type noise in the hypercube on an optimized quantum random-walk search algorithm. When the hypercube occurs with random broken links, the optimized quantum random-walk search algorithm with decoherence is depicted through defining the shift operator which includes the possibility of broken links. For a given database size, we obtain the maximum success rate of the algorithm and the required number of iterations through numerical simulations and analysis when the algorithm is in the presence of decoherence. Then the computational complexity of the algorithm with decoherence is obtained. The results show that the ultimate effect of broken-link-type decoherence on the optimized quantum random-walk search algorithm is negative. (paper)
A Homogeneous and Self-Dual Interior-Point Linear Programming Algorithm for Economic Model Predictive Control

DEFF Research Database (Denmark)

Sokoler, Leo Emil; Frison, Gianluca; Skajaa, Anders

2015-01-01

We develop an efficient homogeneous and self-dual interior-point method (IPM) for the linear programs arising in economic model predictive control of constrained linear systems with linear objective functions. The algorithm is based on a Riccati iteration procedure, which is adapted to the linear system of equations solved in homogeneous and self-dual IPMs. Fast convergence is further achieved using a warm-start strategy. We implement the algorithm in MATLAB and C. Its performance is tested using a conceptual power management case study. Closed loop simulations show that 1) the proposed algorithm...
Assessing Long-Term Wind Conditions by Combining Different Measure-Correlate-Predict Algorithms: Preprint

Energy Technology Data Exchange (ETDEWEB)

Zhang, J.; Chowdhury, S.; Messac, A.; Hodge, B. M.

2013-08-01

This paper significantly advances the hybrid measure-correlate-predict (MCP) methodology, enabling it to account for variations of both wind speed and direction. The advanced hybrid MCP method uses the recorded data of multiple reference stations to estimate the long-term wind condition at a target wind plant site. The results show that the accuracy of the hybrid MCP method is highly sensitive to the combination of the individual MCP algorithms and reference stations. It was also found that the best combination of MCP algorithms varies based on the length of the correlation period.
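For readers unfamiliar with measure-correlate-predict, the basic idea can be sketched with a single reference station and an ordinary least-squares line. This is only the simplest MCP variant, not the hybrid multi-station method of the paper; the wind speeds below are invented and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Correlate: concurrent wind speeds (m/s) at reference and target sites (made-up data).
ref_concurrent = [4.0, 6.0, 8.0, 10.0]
target_concurrent = [3.0, 4.0, 5.0, 6.0]
slope, intercept = fit_line(ref_concurrent, target_concurrent)

# Predict: apply the fitted relation to the long-term reference record.
ref_long_term = [5.0, 7.0, 9.0, 7.0]
target_long_term = [slope * v + intercept for v in ref_long_term]
long_term_mean = sum(target_long_term) / len(target_long_term)
print(slope, intercept, long_term_mean)
```

A hybrid scheme would fit several such models (different algorithms, different reference stations) and combine their predictions, which is where the sensitivity reported in the abstract arises.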
An O(n^5) algorithm for MFE prediction of kissing hairpins and 4-chains in nucleic acids.

Science.gov (United States)

Chen, Ho-Lin; Condon, Anne; Jabbari, Hosna

2009-06-01

Efficient methods for prediction of minimum free energy (MFE) nucleic secondary structures are widely used, both to better understand structure and function of biological RNAs and to design novel nano-structures. Here, we present a new algorithm for MFE secondary structure prediction, which significantly expands the class of structures that can be handled in O(n^5) time. Our algorithm can handle H-type pseudoknotted structures, kissing hairpins, and chains of four overlapping stems, as well as nested substructures of these types.
Predicting Student Academic Performance: A Comparison of Two Meta-Heuristic Algorithms Inspired by Cuckoo Birds for Training Neural Networks

Directory of Open Access Journals (Sweden)

Jeng-Fung Chen

2014-10-01

Full Text Available Predicting student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. This raises the need for a model that predicts student performance based on the results of standardized exams, including university entrance exams and high school graduation exams, and on other influential factors. In this study, an approach to the problem is proposed based on the artificial neural network (ANN) with two meta-heuristic algorithms inspired by cuckoo birds and their lifestyle, namely Cuckoo Search (CS) and the Cuckoo Optimization Algorithm (COA). In particular, we used previous exam results and other factors, such as the location of the student’s high school and the student’s gender, as input variables, and predicted the student's academic performance. The standard CS and standard COA were separately utilized to train the feed-forward network for prediction. The algorithms optimized the weights between layers and the biases of the neural network. The simulation results were then discussed and analyzed to investigate the prediction ability of the neural network trained by these two algorithms. The findings demonstrated that both CS and COA have potential in training ANN, and that ANN-COA obtained slightly better results for predicting student academic performance in this case. It is expected that this work may be used to support student admission procedures and strengthen the service system in educational institutions.
Demonstration of the use of ADAPT to derive predictive maintenance algorithms for the KSC central heat plant

Science.gov (United States)

Hunter, H. E.

1972-01-01

The Avco Data Analysis and Prediction Techniques (ADAPT) were employed to determine laws capable of detecting failures in a heat plant up to three days in advance of the occurrence of the failure. The projected performance of algorithms yielded a detection probability of 90% with false alarm rates of the order of 1 per year for a sample rate of 1 per day with each detection, followed by 3 hourly samplings. This performance was verified on 173 independent test cases. The program also demonstrated diagnostic algorithms and the ability to predict the time of failure to approximately plus or minus 8 hours up to three days in advance of the failure. The ADAPT programs produce simple algorithms which have a unique possibility of a relatively low cost updating procedure. The algorithms were implemented on general purpose computers at Kennedy Space Flight Center and tested against current data.
BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

Science.gov (United States)

Kaur, Harpreet; Raghava, G P S

2002-03-01

Beta-turns play an important role from a structural and functional point of view. Beta-turns are the most common type of non-repetitive structures in proteins and comprise, on average, 25% of the residues. In the past, numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-turns in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows prediction of different types of beta-turns, e.g. type I, I', II, II', VI, VIII and non-specific. This server assists users in predicting the consensus beta-turns in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/
Artificial Fish Swarm Algorithm-Based Particle Filter for Li-Ion Battery Life Prediction

Directory of Open Access Journals (Sweden)

Ye Tian

2014-01-01

Full Text Available An intelligent online prognostic approach is proposed for predicting the remaining useful life (RUL) of lithium-ion (Li-ion) batteries based on the artificial fish swarm algorithm (AFSA) and particle filter (PF), which is an integrated approach combining a model-based method with a data-driven method. The parameters used in the empirical model, which is based on the capacity fade trends of Li-ion batteries, are identified using the tracking ability of PF. AFSA-PF aims to improve the performance of the basic PF. By driving the prior particles to the domain with high likelihood, AFSA-PF allows global optimization and prevents particle degeneracy, thereby improving particle distribution and increasing prediction accuracy and algorithm convergence. Data provided by NASA are used to verify this approach and compare it with basic PF and regularized PF. AFSA-PF is shown to be more accurate and precise.
Artificial Neural Network Algorithm for Condition Monitoring of DC-link Capacitors Based on Capacitance Estimation

DEFF Research Database (Denmark)

Soliman, Hammam Abdelaal Hammam; Wang, Huai; Gadalla, Brwene Salah Abdelkarim

2015-01-01

In power electronic converters, reliability of DC-link capacitors is one of the critical issues. The estimation of their health status as an application of condition monitoring has been an attractive subject for the industrial field and hence for the academic research field as well. More reliable solutions are required to be adopted by the industry applications in which usage of extra hardware, increased cost, and low estimation accuracy are the main challenges. Therefore, development of new condition monitoring methods based on software solutions could be the new era that covers the aforementioned challenges. A capacitance estimation method based on Artificial Neural Network (ANN) algorithm is therefore proposed in this paper. The implemented ANN estimated the capacitance of the DC-link capacitor in a back-to-back converter. Analysis of the error of the capacitance estimation is also given...
Capacitance Estimation for DC-link Capacitors in a Back-to-Back Converter Based on Artificial Neural Network Algorithm

DEFF Research Database (Denmark)

Soliman, Hammam Abdelaal Hammam; Wang, Huai; Blaabjerg, Frede

2016-01-01

The reliability of dc-link capacitors in power electronic converters is one of the critical aspects to be considered in modern power converter design. The observation of their ageing process and the estimation of their health status have been an attractive subject for the industrial field and hence... of the aforementioned challenges and shortcomings. In this paper, a pure software condition monitoring method based on Artificial Neural Network (ANN) algorithm is proposed. The implemented ANN estimates the capacitance of the dc-link capacitor in a back-to-back converter. The error analysis of the estimated results...
Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.

Science.gov (United States)

Huang, Cai; Mezencev, Roman; McDonald, John F; Vannberg, Fredrik

2017-01-01

Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM) algorithm combined with a standard recursive feature elimination (RFE) approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60). The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publically available ovarian cancer (OC) patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.
Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.

Directory of Open Access Journals (Sweden)

Cai Huang

Full Text Available Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM) algorithm combined with a standard recursive feature elimination (RFE) approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60). The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publically available ovarian cancer (OC) patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.
Performance prediction of a synchronization link for distributed aerospace wireless systems.

Science.gov (United States)

Wang, Wen-Qin; Shao, Huaizong

2013-01-01

For reasons of stealth and other operational advantages, distributed aerospace wireless systems have received much attention in recent years. In a distributed aerospace wireless system, since the transmitter and receiver are placed on separate platforms that use independent master oscillators, there is no cancellation of low-frequency phase noise as in the monostatic case. Thus, highly accurate time and frequency synchronization techniques are required for distributed wireless systems. The use of a dedicated synchronization link to quantify and compensate oscillator frequency instability is investigated in this paper. With the mathematical statistical models of phase noise, closed-form analytic expressions for the synchronization link performance are derived. The possible error contributions including oscillator, phase-locked loop, and receiver noise are quantified. The link synchronization performance is predicted by utilizing the knowledge of the statistical models, system error contributions, and sampling considerations. Simulation results show that effective synchronization error compensation can be achieved by using this dedicated synchronization link.

Developing robust arsenic awareness prediction models using machine learning algorithms.

Science.gov (United States)

Singh, Sushant K; Taylor, Robert W; Rahman, Mohammad Mahmudur; Pradhan, Biswajeet

2018-04-01

Arsenic awareness plays a vital role in ensuring the sustainability of arsenic mitigation technologies. Thus far, however, few studies have dealt with the sustainability of such technologies and its associated socioeconomic dimensions. As a result, arsenic awareness prediction has not yet been fully conceptualized. Accordingly, this study evaluated arsenic awareness among arsenic-affected communities in rural India, using a structured questionnaire to record socioeconomic, demographic, and other sociobehavioral factors with an eye to assessing their association with and influence on arsenic awareness. First, a logistic regression model was applied and its results compared with those produced by six state-of-the-art machine-learning algorithms (Support Vector Machine [SVM], Kernel-SVM, Decision Tree [DT], k-Nearest Neighbor [k-NN], Naïve Bayes [NB], and Random Forests [RF]) as measured by their accuracy at predicting arsenic awareness. Most (63%) of the surveyed population was found to be arsenic-aware. Significant arsenic awareness predictors were divided into three types: (1) socioeconomic factors: caste, education level, and occupation; (2) water and sanitation behavior factors: number of family members involved in water collection, distance traveled and time spent for water collection, places for defecation, and materials used for handwashing after defecation; and (3) social capital and trust factors: presence of anganwadi and people's trust in other community members, NGOs, and private agencies. Moreover, individuals' having higher social network positively contributed to arsenic awareness in the communities. Results indicated that both the SVM and the RF algorithms outperformed the others in overall prediction of arsenic awareness, a nonlinear classification problem. Lower-caste, less educated, and unemployed members of the population were found to be the most vulnerable, requiring immediate arsenic mitigation. To this end, local social institutions and NGOs could play a
Characterization and prediction of the backscattered form function of an immersed cylindrical shell using hybrid fuzzy clustering and bio-inspired algorithms.

Science.gov (United States)

Agounad, Said; Aassif, El Houcein; Khandouch, Younes; Maze, Gérard; Décultot, Dominique

2018-02-01

The acoustic scattering of a plane wave by an elastic cylindrical shell is studied. A new approach is developed to predict the form function of an immersed cylindrical shell of the radius ratio b/a ('b' is the inner radius and 'a' is the outer radius). The prediction of the backscattered form function is investigated by a combined approach between fuzzy clustering algorithms and bio-inspired algorithms. Four well-known fuzzy clustering algorithms, the fuzzy c-means (FCM), the Gustafson-Kessel algorithm (GK), the fuzzy c-regression model (FCRM) and the Gath-Geva algorithm (GG), are combined with particle swarm optimization and genetic algorithm. The symmetric and antisymmetric circumferential waves A, S0, A1, S1 and S2 are investigated in a reduced frequency (k1a) range extending over 0.1... predicted and calculated acoustic backscattered form functions. This representation is used as a comparison criterion between the calculated form function by the analytical method and that predicted by the proposed approach on the one hand and is used to extract the predicted cut-off frequencies on the other hand. Moreover, the transverse velocity of the material constituting the cylindrical shell is extracted. The computational results show that the proposed approach is very efficient to predict the form function and consequently, for acoustic characterization purposes. Copyright © 2017 Elsevier B.V. All rights reserved.
[Prediction of regional soil quality based on mutual information theory integrated with decision tree algorithm].

Science.gov (United States)

Lin, Fen-Fang; Wang, Ke; Yang, Ning; Yan, Shi-Guang; Zheng, Xin-Yu

2012-02-01

In this paper, some main factors such as soil type, land use pattern, lithology type, topography, road, and industry type that affect soil quality were used to precisely obtain the spatial distribution characteristics of regional soil quality, mutual information theory was adopted to select the main environmental factors, and decision tree algorithm See 5.0 was applied to predict the grade of regional soil quality. The main factors affecting regional soil quality were soil type, land use, lithology type, distance to town, distance to water area, altitude, distance to road, and distance to industrial land. The prediction accuracy of the decision tree model with the variables selected by mutual information was obviously higher than that of the model with all variables, and, for the former model, whether of decision tree or of decision rule, its prediction accuracy was all higher than 80%. Based on the continuous and categorical data, the method of mutual information theory integrated with decision tree could not only reduce the number of input parameters for decision tree algorithm, but also predict and assess regional soil quality effectively.
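The variable-selection step rests on mutual information between each environmental factor and the soil quality grade. A minimal sketch for categorical variables follows; the toy records and variable names are invented for illustration and are not data from the study.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two categorical sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )

# Toy records (hypothetical): soil type determines the grade, land use does not.
soil_type = ["clay", "clay", "sand", "sand"]
land_use = ["farm", "town", "farm", "town"]
grade = ["high", "high", "low", "low"]

mi_soil = mutual_information(soil_type, grade)
mi_land = mutual_information(land_use, grade)
print(mi_soil, mi_land)
```

Factors would then be ranked by this score and only the highest-scoring ones passed to the decision tree, which is how the number of input parameters is reduced.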
NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm.

Science.gov (United States)

Xue, Yu; Chen, Hu; Jin, Changjiang; Sun, Zhirong; Yao, Xuebiao

2006-10-17

Protein palmitoylation, an essential and reversible post-translational modification (PTM), has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are much desirable for further experimental design. In this work, we present NBA-Palm, a novel computational method based on Naïve Bayes algorithm for prediction of palmitoylation site. The training data is curated from scientific literature (PubMed) and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, RBF network and support vector machine (SVM), also have been employed and compared with NBA-Palm. Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. The NBA-Palm is freely accessible from: http://www.bioinfo.tsinghua.edu.cn/NBA-Palm.
Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

Science.gov (United States)

Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

2016-01-01

With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD produces an aggregate indicator for the virtual machines in the cloud, which becomes the input to the predictive RBFNN. Also, this paper proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predicted periodic load value of nodes based on AHPGD and RBFNN optimized by HHGA, then calculates the corresponding weight values of nodes and updates them constantly. Meanwhile, it keeps the advantages and avoids the shortcomings of the static weighted round-robin algorithm.
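The scheduling idea, turning a predicted load per node into round-robin weights, can be sketched as below. The load-to-weight mapping, the node names, and the use of the smooth weighted round-robin variant are assumptions made for this illustration; the abstract does not specify them, and the predicted loads here are simply given rather than produced by an RBFNN.

```python
from collections import Counter

def weights_from_load(predicted_load, scale=10):
    """Map predicted load in [0, 1] to integer weights (assumed linear mapping)."""
    return {node: max(1, round((1.0 - load) * scale))
            for node, load in predicted_load.items()}

def weighted_round_robin(weights, n_requests):
    """Smooth weighted round-robin: spreads picks of heavy nodes over the cycle."""
    total = sum(weights.values())
    current = {node: 0 for node in weights}
    schedule = []
    for _ in range(n_requests):
        for node, w in weights.items():
            current[node] += w                   # every node gains its weight
        chosen = max(current, key=current.get)   # highest running score wins
        current[chosen] -= total                 # winner pays the total weight
        schedule.append(chosen)
    return schedule

# Hypothetical predicted loads for three nodes.
predicted_load = {"node-a": 0.2, "node-b": 0.5, "node-c": 0.8}
weights = weights_from_load(predicted_load)
schedule = weighted_round_robin(weights, sum(weights.values()))
counts = Counter(schedule)
print(weights, dict(counts))
```

Recomputing `weights` each time a new load prediction arrives is what makes the scheme dynamic rather than static.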
Capacitance estimation algorithm based on DC-link voltage harmonics using artificial neural network in three-phase motor drive systems

DEFF Research Database (Denmark)

Soliman, Hammam Abdelaal Hammam; Davari, Pooya; Wang, Huai

2017-01-01

In modern design of power electronic converters, reliability of dc-link capacitors is one of the critical aspects considered. The industrial field has been attracted to the monitoring of their health condition and the estimation of their ageing process status. However, the existing condition... to industry. In this digest, a condition monitoring methodology that estimates the capacitance value of the dc-link capacitor in a three phase Front-End diode bridge motor drive is proposed. The proposed software methodology is based on Artificial Neural Network (ANN) algorithm. The harmonics of the dc-link voltage are used as training data to the Artificial Neural Network. Fast Fourier Transform (FFT) of the dc-link voltage is analysed in order to study the impact of capacitance variation on the harmonics order. Laboratory experiments are conducted to validate the proposed methodology and the error analysis...
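Extracting harmonic magnitudes from a dc-link voltage waveform, the kind of feature that could feed such a network, can be sketched with a direct discrete Fourier transform. The waveform below is synthetic: a 540 V dc level with 6th and 12th harmonic ripple whose amplitudes are invented, sampled over exactly one fundamental period. The function name is hypothetical and no claim is made about the authors' actual feature set.

```python
import cmath
import math

def harmonic_magnitudes(samples, harmonics):
    """Amplitude of selected harmonics via a direct DFT.

    Assumes `samples` covers exactly one fundamental period, so DFT bin k
    corresponds to the k-th harmonic.
    """
    n = len(samples)
    magnitudes = {}
    for k in harmonics:
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        magnitudes[k] = 2 * abs(coeff) / n   # scale bin value to peak amplitude
    return magnitudes

# Synthetic dc-link voltage (illustrative values only).
n = 360
voltage = [540.0
           + 5.0 * math.cos(2 * math.pi * 6 * i / n)
           + 1.0 * math.cos(2 * math.pi * 12 * i / n)
           for i in range(n)]
mags = harmonic_magnitudes(voltage, [6, 12, 18])
print({k: round(v, 3) for k, v in mags.items()})
```

In a monitoring setting, vectors like `mags` recorded at known capacitance values would form the training pairs for the estimator.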
TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers

Science.gov (United States)

Cao, Han; Ng, Marcus C. K.; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W. I.

2017-09-01

α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, so computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method, TMDIM, based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. The three major algorithmic improvements are the introduction of packing type classification, multiple-condition decoy filtering, and cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD... PHP, MySQL and Apache, with all major browsers supported.
REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets

Directory of Open Access Journals (Sweden)

Carmen Brando

2016-07-01

Full Text Available This paper proposes a graph-based Named Entity Linking (NEL) algorithm named REDEN for the disambiguation of authors’ names in French literary criticism texts and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL as reported in the current state of the art, namely, candidate retrieval and candidate selection. REDEN leverages knowledge from different Linked Data sources in order to select candidates for each author mention, subsequently crawls data from other Linked Data sets using equivalence links (e.g., owl:sameAs), and, finally, fuses graphs of homologous individuals into a non-redundant graph well-suited for graph centrality calculation; the resulting graph is used for choosing the best referent. The REDEN algorithm is distributed as open source and follows current standards in digital editions (TEI) and semantic Web (RDF). Its integration into an editorial workflow of digital editions in Digital humanities and cultural heritage projects is entirely plausible. Experiments are conducted along with the corresponding error analysis in order to test our approach and to help us study the weaknesses and strengths of our algorithm, and thereby to make further improvements to REDEN.
An administrative data validation study of the accuracy of algorithms for identifying rheumatoid arthritis: the influence of the reference standard on algorithm performance.

Science.gov (United States)

Widdifield, Jessica; Bombardier, Claire; Bernatsky, Sasha; Paterson, J Michael; Green, Diane; Young, Jacqueline; Ivers, Noah; Butt, Debra A; Jaakkimainen, R Liisa; Thorne, J Carter; Tu, Karen

2014-06-23

We have previously validated administrative data algorithms to identify patients with rheumatoid arthritis (RA) using rheumatology clinic records as the reference standard. Here we reassessed the accuracy of the algorithms using primary care records as the reference standard. We performed a retrospective chart abstraction study using a random sample of 7500 adult patients under the care of 83 family physicians contributing to the Electronic Medical Record Administrative data Linked Database (EMRALD) in Ontario, Canada. Using physician-reported diagnoses as the reference standard, we computed and compared the sensitivity, specificity, and predictive values for over 100 administrative data algorithms for RA case ascertainment. We identified 69 patients with RA for a lifetime RA prevalence of 0.9%. All algorithms had excellent specificity (>97%). However, sensitivity varied (75-90%) among physician billing algorithms. Despite the low prevalence of RA, most algorithms had adequate positive predictive value (PPV; 51-83%). The algorithm of "[1 hospitalization RA diagnosis code] or [3 physician RA diagnosis codes with ≥1 by a specialist over 2 years]" had a sensitivity of 78% (95% CI 69-88), specificity of 100% (95% CI 100-100), PPV of 78% (95% CI 69-88) and NPV of 100% (95% CI 100-100). Administrative data algorithms for detecting RA patients achieved a high degree of accuracy amongst the general population. However, results varied slightly from our previous report, which can be attributed to differences in the reference standards with respect to disease prevalence, spectrum of disease, and type of comparator group.
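The relationship between the reported sensitivity, specificity, and predictive values can be checked from a confusion matrix. The cell counts below are hypothetical: they were chosen only to be consistent with the 69 RA patients among 7500 and the roughly 78% sensitivity and PPV quoted for the highlighted algorithm, and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures for a binary case-finding algorithm."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),           # share of flagged patients who are cases
        "npv": tn / (tn + fn),           # share of unflagged patients who are not
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }

# Hypothetical counts: 69 true cases in a sample of 7500.
m = diagnostic_metrics(tp=54, fp=15, fn=15, tn=7416)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With prevalence under 1%, even a specificity near 100% leaves enough false positives to pull PPV well below the specificity, which is why PPV is the figure most sensitive to the choice of reference population.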
ANFIS Based Time Series Prediction Method of Bank Cash Flow Optimized by Adaptive Population Activity PSO Algorithm

Directory of Open Access Journals (Sweden)

Jie-Sheng Wang

2015-06-01

Full Text Available To improve the accuracy and timeliness of information in the cash business, and to address the low accuracy and stability of the data linkage between cash inventory forecasting and cash management information in commercial banks, a hybrid learning algorithm is proposed that combines an adaptive population activity particle swarm optimization (APAPSO) algorithm with the least squares method (LMS) to optimize the parameters of an adaptive network-based fuzzy inference system (ANFIS) model. A metric function of population diversity is introduced to maintain the diversity of the population and to adapt the inertia weight and learning factors, which improves the optimization ability of the particle swarm optimization (PSO) algorithm and avoids its premature convergence problem. Simulation comparison experiments against the BP-LMS algorithm and the standard PSO-LMS algorithm are carried out on real commercial bank cash flow data to verify the effectiveness of the proposed time series prediction of bank cash flow based on the improved PSO-ANFIS optimization method. Simulation results show that the optimization speed is faster and the prediction accuracy is higher.
Development and verification of an analytical algorithm to predict absorbed dose distributions in ocular proton therapy using Monte Carlo simulations

International Nuclear Information System (INIS)

Koch, Nicholas C; Newhauser, Wayne D

2010-01-01

Proton beam radiotherapy is an effective and non-invasive treatment for uveal melanoma. Recent research efforts have focused on improving the dosimetric accuracy of treatment planning and overcoming the present limitation of relative analytical dose calculations. Monte Carlo algorithms have been shown to accurately predict dose per monitor unit (D/MU) values, but this has yet to be shown for analytical algorithms dedicated to ocular proton therapy, which are typically less computationally expensive than Monte Carlo algorithms. The objective of this study was to determine if an analytical method could predict absolute dose distributions and D/MU values for a variety of treatment fields like those used in ocular proton therapy. To accomplish this objective, we used a previously validated Monte Carlo model of an ocular nozzle to develop an analytical algorithm to predict three-dimensional distributions of D/MU values from pristine Bragg peaks and therapeutically useful spread-out Bragg peaks (SOBPs). Results demonstrated generally good agreement between the analytical and Monte Carlo absolute dose calculations. While agreement in the proximal region decreased for beams with less penetrating Bragg peaks compared with the open-beam condition, the difference was shown to be largely attributable to edge-scattered protons. A method for including this effect in any future analytical algorithm was proposed. Comparisons of D/MU values showed typical agreement to within 0.5%. We conclude that analytical algorithms can be employed to accurately predict absolute proton dose distributions delivered by an ocular nozzle.
NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm

Directory of Open Access Journals (Sweden)

Jin Changjiang

2006-10-01

Full Text Available Abstract Background Protein palmitoylation, an essential and reversible post-translational modification (PTM), has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are highly desirable for further experimental design. Results In this work, we present NBA-Palm, a novel computational method based on the Naïve Bayes algorithm for prediction of palmitoylation sites. The training data is curated from the scientific literature (PubMed) and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, RBF network and support vector machine (SVM), have also been employed and compared with NBA-Palm. Conclusion Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. NBA-Palm is freely accessible from: http://www.bioinfo.tsinghua.edu.cn/NBA-Palm.
Algorithm aversion: people erroneously avoid algorithms after seeing them err.

Science.gov (United States)

Dietvorst, Berkeley J; Simmons, Joseph P; Massey, Cade

2015-02-01

Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.
Parametric study on the advantages of weather-predicted control algorithm of free cooling ventilation system

International Nuclear Information System (INIS)

Medved, Sašo; Babnik, Miha; Vidrih, Boris; Arkar, Ciril

2014-01-01

Predicted climate changes and the increased intensity of urban heat islands, as well as population aging, will increase the energy demand for the cooling of buildings in the future. However, the energy demand for cooling can be efficiently reduced by low-exergy free-cooling systems, which use natural processes, such as evaporative cooling or the environmental cold of ambient air during night-time ventilation, for the cooling of buildings. Unlike mechanical cooling systems, a free-cooling system needs energy only to transport the cold from the environment into the building. Because the natural cold potential is time dependent, the efficiency of free-cooling systems could be improved by introducing a weather forecast into the control algorithm. In the article, a numerical algorithm for optimizing the operation of free-cooling systems with night-time ventilation is presented and validated on a test cell with different thermal storage capacities and under different ambient conditions. As a case study, the advantage of weather-predicted control is presented for a summer week in a typical office room. The results show the necessity of weather-predicted control of free-cooling ventilation systems for achieving the highest overall energy efficiency of such systems in comparison to mechanical cooling, better indoor comfort conditions and a decrease in the primary energy needed for cooling of the buildings. - Highlights: • Energy demand for cooling will increase due to climate changes and urban heat islands. • Free cooling could significantly reduce energy demand for cooling of the buildings. • Free cooling is more effective if weather prediction is included in operation control. • The weather-predicted free-cooling operation algorithm was validated on a test cell. • Advantages of free cooling over mechanical cooling are shown with different indicators.
Long-term prediction of chaotic time series with multi-step prediction horizons by a neural network with Levenberg-Marquardt learning algorithm

International Nuclear Information System (INIS)

Mirzaee, Hossein

2009-01-01

The Levenberg-Marquardt learning algorithm is applied to train a multilayer perceptron (MLP) with three hidden layers of ten neurons each in order to carefully map the structure of chaotic time series such as the Mackey-Glass time series. First the MLP network is trained with 1000 data points, and then it is tested with the next 500 data points. After that, the trained and tested network is applied to long-term prediction of the next 120 data points, which come after the test data. The prediction proceeds as follows: the first inputs to the network are the last four data points of the test data; each predicted value is then shifted into the regression vector that forms the network input, so that after the first four prediction steps the input regression vector consists entirely of predicted values, and each subsequent prediction is likewise shifted into the input vector for the next step.
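The shifting scheme described in this abstract is a recursive multi-step forecast. A minimal sketch follows, assuming a generic one-step predictor passed in as a callable; the names `recursive_forecast`, `one_step` and `mean_model` are hypothetical, and the stand-in model below is not the trained MLP from the paper.

```python
from typing import Callable, List, Sequence


def recursive_forecast(one_step: Callable[[Sequence[float]], float],
                       history: Sequence[float],
                       lags: int = 4,
                       horizon: int = 120) -> List[float]:
    """Feed each prediction back into the input window for the next step."""
    window = list(history[-lags:])  # last `lags` observed values
    if len(window) < lags:
        raise ValueError("history is shorter than the number of lags")
    predictions = []
    for _ in range(horizon):
        y = one_step(window)
        predictions.append(y)
        # Drop the oldest input and append the prediction; after `lags`
        # steps the window holds predicted values only.
        window = window[1:] + [y]
    return predictions


def mean_model(window: Sequence[float]) -> float:
    """Stand-in predictor (assumption): the mean of the input window."""
    return sum(window) / len(window)


print(recursive_forecast(mean_model, [0.9, 1.1, 1.0, 1.2], horizon=3))
```

Because every step after the fourth consumes only predicted values, one-step errors can compound over the horizon, which is what makes long-term prediction of chaotic series a demanding test.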
Screw Remaining Life Prediction Based on Quantum Genetic Algorithm and Support Vector Machine

Directory of Open Access Journals (Sweden)

Xiaochen Zhang

2017-01-01

Full Text Available To predict the remaining life of a ball screw, a screw remaining life prediction method based on a quantum genetic algorithm (QGA) and a support vector machine (SVM) is proposed. A screw accelerated test bench is introduced. Accelerometers are installed to monitor the performance degradation of the ball screw. Using wavelet packet decomposition combined with isometric mapping (Isomap), the sensitive feature vectors are obtained and stored in a database. The sensitive feature vectors are then randomly chosen from the database to constitute training samples and testing samples. The optimal kernel function parameter and penalty factor of the SVM are searched with QGA. Finally, the training samples are used to train the optimized SVM, while the testing samples are used to test the prediction accuracy of the trained SVM, which yields the screw remaining life prediction model. The experimental results show that the screw remaining life prediction model can effectively predict screw remaining life.
Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

Science.gov (United States)

Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

2007-01-01

To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.
Prediction of insemination outcomes in Holstein dairy cattle using alternative machine learning algorithms.

Science.gov (United States)

Shahinfar, Saleh; Page, David; Guenther, Jerry; Cabrera, Victor; Fricke, Paul; Weigel, Kent

2014-02-01

When making the decision about whether or not to breed a given cow, knowledge about the expected outcome would have an economic impact on profitability of the breeding program and net income of the farm. The outcome of each breeding can be affected by many management and physiological features that vary between farms and interact with each other. Hence, the ability of machine learning algorithms to accommodate complex relationships in the data and missing values for explanatory variables makes these algorithms well suited for investigation of reproduction performance in dairy cattle. The objective of this study was to develop a user-friendly and intuitive on-farm tool to help farmers make reproduction management decisions. Several different machine learning algorithms were applied to predict the insemination outcomes of individual cows based on phenotypic and genotypic data. Data from 26 dairy farms in the Alta Genetics (Watertown, WI) Advantage Progeny Testing Program were used, representing a 10-yr period from 2000 to 2010. Health, reproduction, and production data were extracted from on-farm dairy management software, and estimated breeding values were downloaded from the US Department of Agriculture Agricultural Research Service Animal Improvement Programs Laboratory (Beltsville, MD) database. The edited data set consisted of 129,245 breeding records from primiparous Holstein cows and 195,128 breeding records from multiparous Holstein cows. Each data point in the final data set included 23 and 25 explanatory variables and 1 binary outcome for of 0.756 ± 0.005 and 0.736 ± 0.005 for primiparous and multiparous cows, respectively. The naïve Bayes algorithm, Bayesian network, and decision tree algorithms showed somewhat poorer classification performance. An information-based variable selection procedure identified herd average conception rate, incidence of ketosis, number of previous (failed) inseminations, days in milk at breeding, and mastitis as the most
Algorithmic paranoia and the convivial alternative

Directory of Open Access Journals (Sweden)

Dan McQuillan

2016-11-01

Full Text Available In a time of big data, thinking about how we are seen and how that affects our lives means changing our idea about who does the seeing. Data produced by machines is most often ‘seen’ by other machines; the eye in question is algorithmic. Algorithmic seeing does not produce a computational panopticon but a mechanism of prediction. The authority of its predictions rests on a slippage of the scientific method into the world of data. Data science inherits some of the problems of science, especially the disembodied ‘view from above’, and adds new ones of its own. As its core methods like machine learning are based on seeing correlations, not understanding causation, it reproduces the prejudices of its input. Rising into the apparatuses of governance, it reinforces the problematic sides of ‘seeing like a state’ and links to the recursive production of paranoia. It forces us to ask the question ‘what counts as rational seeing?’. Answering this from a position of feminist empiricism reveals different possibilities latent in seeing with machines. Grounded in the idea of conviviality, machine learning may reveal forgotten non-market patterns and enable free and critical learning. It is proposed that a programme to challenge the production of irrational pre-emption is also a search for the possibility of algorithmic conviviality.
NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction

Directory of Open Access Journals (Sweden)

Lund Ole

2009-09-01

Full Text Available Abstract Background The major histocompatibility complex (MHC) molecule plays a central role in controlling the adaptive immune response to infections. MHC class I molecules present peptides derived from intracellular proteins to cytotoxic T cells, whereas MHC class II molecules stimulate cellular and humoral immunity through presentation of extracellularly derived peptides to helper T cells. Identification of which peptides will bind a given MHC molecule is thus of great importance for the understanding of host-pathogen interactions, and large efforts have been placed in developing algorithms capable of predicting this binding event. Results Here, we present a novel artificial neural network-based method, NN-align, that allows for simultaneous identification of the MHC class II binding core and binding affinity. NN-align is trained using a novel training algorithm that allows for correction of bias in the training data due to redundant binding core representation. Incorporation of information about the residues flanking the peptide-binding core is shown to significantly improve the prediction accuracy. The method is evaluated on a large-scale benchmark consisting of six independent data sets covering 14 human MHC class II alleles, and is demonstrated to outperform other state-of-the-art MHC class II prediction methods. Conclusion The NN-align method is competitive with the state-of-the-art MHC class II peptide binding prediction algorithms. The method is publicly available at http://www.cbs.dtu.dk/services/NetMHCII-2.0.

Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

Directory of Open Access Journals (Sweden)

Wu Chi-Yeh

2010-01-01

Full Text Available Abstract Background MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as the support vector machine (SVM) are extensively adopted in these ab initio approaches due to the prediction performance they achieve. On the other hand, logic based classifiers such as decision trees, whose constructed models are interpretable, have attracted less attention. Results This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G2DE) based classifier. The G2DE is a kernel based algorithm designed to provide interpretability by utilizing a few but representative kernels for constructing the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor is capable of achieving prediction performance comparable to that delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set. Conclusion Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G2DE employed in this study can deliver prediction accuracy comparable with the state-of-the-art kernel based machine learning algorithms. Furthermore, biologists can obtain valuable insights about the different characteristics of the sequences of pre-miRNAs with the models generated by the G
Advanced Emergency Braking Control Based on a Nonlinear Model Predictive Algorithm for Intelligent Vehicles

Directory of Open Access Journals (Sweden)

Ronghui Zhang

2017-05-01

Full Text Available Focusing on safety and comfort, with the overall aim of comprehensively improving a vision-based intelligent vehicle, a novel Advanced Emergency Braking System (AEBS) is proposed based on a nonlinear model predictive algorithm. Considering the nonlinearities of vehicle dynamics, a vision-based longitudinal vehicle dynamics model is established. On account of the nonlinear coupling characteristics of the driver, surroundings, and vehicle itself, a hierarchical control structure is proposed to decouple and coordinate the system. To avoid or reduce the collision risk between the intelligent vehicle and collision objects, a coordinated cost function of tracking safety, comfort, and fuel economy is formulated. Based on the terminal constraints of stable tracking, a multi-objective optimization controller is proposed using the theory of nonlinear model predictive control. To quickly and precisely track the control target in a finite time, an electronic brake controller for AEBS is designed based on the Nonsingular Fast Terminal Sliding Mode (NFTSM) control theory. To validate the performance and advantages of the proposed algorithm, simulations are implemented. According to the simulation results, the proposed algorithm has better integrated performance in reducing the collision risk and improving the driving comfort and fuel economy of the smart car compared with the existing single AEBS.
Algorithm of dynamic stabilization system for a car 4x4 with a link rear axle

Directory of Open Access Journals (Sweden)

M. M. Jileikin

2014-01-01

Full Text Available The slow development of active safety systems for all-wheel-drive automobiles is linked to a lack of research on power distribution under specific conditions of movement. The purpose of this work is to develop methods to control the curvilinear motion of 4x4 cars with a link to the rear axle that increase the directional and trajectory stability of the car. The paper analyses the known methods for increasing the movement stability of wheeled vehicles. It also offers a method for power flow redistribution in the transmission of a 4x4 car with a link to the rear axle that increases the directional and trajectory stability of the car. To study the performance and effectiveness of the proposed method, a mathematical model of a moving 4x4 car with a link to the rear axle is developed. Simulation methods allowed us to establish the following: 1. For a 4x4 car with redistribution of torque between the driving axles in the range of 100:0 - 50:50 and with redistribution of torque between the wheels of the rear axle in the range of 0:100, the most effective option is the combination of stabilization algorithms "Lowering power consumption of the engine + Creation of a stabilizing moment due to the redistribution of torque on different wheels", which increases directional and trajectory stability by 12...93%; 2. For a 4x4 car with redistribution of torque between the driving axles in the range 100:0 - 0:100 and with redistribution of torque between the wheels of the rear axle in the range of 0:100, the best option is the combination of algorithms "Lowering power consumption of the engine + Creation of a stabilizing moment due to the redistribution of torque on different wheels", which increases directional and trajectory stability by 27...93%. A comparative analysis of algorithms efficiency of dynamic stabilization system operation for two-axle wheeled vehicles depending on the torque redistribution between the driving
Ternary alloy material prediction using genetic algorithm and cluster expansion

Energy Technology Data Exchange (ETDEWEB)

Chen, Chong [Iowa State Univ., Ames, IA (United States)

2015-12-01

This thesis summarizes our study on crystal structure prediction for the Fe-V-Si system using a genetic algorithm and cluster expansion. Our goal is to explore and look for new stable compounds. We started from the ten currently known experimental phases and calculated the formation energies of those compounds using a density functional theory (DFT) package, namely VASP. The convex hull was generated from the DFT calculations of the experimentally known phases. We then did a random search on some metal-rich (Fe and V) compositions and found that the lowest-energy structures had a body-centered cubic (bcc) underlying lattice, on which we based our systematic computational searches using the genetic algorithm and cluster expansion. Among hundreds of searched compositions, thirteen were selected and their DFT formation energies were obtained with VASP. The stability of those thirteen compounds was checked against the experimental convex hull. We found that the composition 24-8-16, i.e., Fe₃VSi₂, is a new stable phase, which may inspire future experiments.
Fast Quantum Algorithm for Predicting Descriptive Statistics of Stochastic Processes

Science.gov (United States)

Williams Colin P.

1999-01-01

Stochastic processes are used as a modeling tool in several sub-fields of physics, biology, and finance. Analytic understanding of the long term behavior of such processes is only tractable for very simple types of stochastic processes such as Markovian processes. However, in real world applications more complex stochastic processes often arise. In physics, the complicating factor might be nonlinearities; in biology it might be memory effects; and in finance it might be the non-random intentional behavior of participants in a market. In the absence of analytic insight, one is forced to understand these more complex stochastic processes via numerical simulation techniques. In this paper we present a quantum algorithm for performing such simulations. In particular, we show how a quantum algorithm can predict arbitrary descriptive statistics (moments) of N-step stochastic processes in just O(square root of N) time. That is, the quantum complexity is the square root of the classical complexity for performing such simulations. This is a significant speedup in comparison to the current state of the art.
Development and validation of a prediction algorithm for the onset of common mental disorders in a working population.

Science.gov (United States)

Fernandez, Ana; Salvador-Carulla, Luis; Choi, Isabella; Calvo, Rafael; Harvey, Samuel B; Glozier, Nicholas

2018-01-01

Common mental disorders are the most common reason for long-term sickness absence in most developed countries. Prediction algorithms for the onset of common mental disorders may help target indicated work-based prevention interventions. We aimed to develop and validate a risk algorithm to predict the onset of common mental disorders at 12 months in a working population. We conducted a secondary analysis of the Household, Income and Labour Dynamics in Australia Survey, a longitudinal, nationally representative household panel in Australia. Data from the 6189 working participants who did not meet the criteria for a common mental disorder at baseline were non-randomly split into training and validation databases, based on state of residence. Common mental disorders were assessed with the mental component score of the 36-Item Short Form Health Survey questionnaire (score ⩽45). Risk algorithms were constructed following recommendations made by the Transparent Reporting of a multivariable prediction model for Prevention Or Diagnosis statement. Different risk factors were identified among women and men for the final risk algorithms. In the training data, the model for women had a C-index of 0.73 and effect size (Hedges' g) of 0.91. In men, the C-index was 0.76 and the effect size was 1.06. In the validation data, the C-index was 0.66 for women and 0.73 for men, with positive predictive values of 0.28 and 0.26, respectively. Conclusion: It is possible to develop an algorithm with good discrimination for identifying overall and modifiable risks of the onset of common mental disorders among working men. Such models have the potential to change the way that prevention of common mental disorders at the workplace is conducted, but different models may be required for women.
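The C-index quoted in this abstract measures discrimination. For a binary outcome it is the share of case/non-case pairs in which the case received the higher predicted risk, with ties counted as one half. The sketch below is a direct pairwise computation; the function name and the toy inputs are assumptions for illustration and do not come from the survey data.

```python
def c_index(scores, outcomes):
    """Concordance for a binary outcome: P(score of case > score of non-case)."""
    concordant = 0.0
    pairs = 0
    for s_case, y_case in zip(scores, outcomes):
        if y_case != 1:
            continue
        for s_ctrl, y_ctrl in zip(scores, outcomes):
            if y_ctrl != 0:
                continue
            pairs += 1
            if s_case > s_ctrl:
                concordant += 1.0
            elif s_case == s_ctrl:
                concordant += 0.5  # ties count as half-concordant
    if pairs == 0:
        raise ValueError("need at least one case and one non-case")
    return concordant / pairs


# Toy example (hypothetical risks): two cases and two non-cases.
print(c_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the drop from 0.73 to 0.66 between training and validation data for women.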
A novel gene network inference algorithm using predictive minimum description length approach.

Science.gov (United States)

Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

2010-05-28

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm.

Directory of Open Access Journals (Sweden)

Daniel M de Brito

Full Text Available Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic island prediction have been proposed. However, none of them is capable of predicting precisely the complete repertory of GIs in a genome. The difficulties arise because the performance of different algorithms changes with the variety of nucleotide distributions in different species. In this paper, we present a novel method to predict GIs that is built upon the mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP--Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also novel, previously unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions of this new methodology are available at http://msgip.integrativebioinformatics.me.
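Mean shift clustering, on which this method is built, moves every point to the mean of its neighbours until it settles on a density mode, so the number of clusters does not have to be given in advance. The one-dimensional sketch below uses a flat kernel and takes the bandwidth as an explicit parameter; it does not reproduce MSGIP or its bandwidth heuristic, and the function name and the sample values (imagined per-window composition scores) are assumptions.

```python
def mean_shift_1d(points, bandwidth, tol=1e-6, max_iter=100):
    """Flat-kernel mean shift in 1-D; returns (cluster centres, labels)."""
    modes = []
    for x in points:
        m = float(x)
        for _ in range(max_iter):
            # Shift to the mean of all points within `bandwidth` of m.
            neighbours = [p for p in points if abs(p - m) <= bandwidth]
            new_m = sum(neighbours) / len(neighbours)
            converged = abs(new_m - m) < tol
            m = new_m
            if converged:
                break
        modes.append(m)

    # Merge modes that ended up closer than half a bandwidth apart.
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if abs(m - c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels


# Hypothetical composition scores for six genome windows.
scores = [0.40, 0.41, 0.42, 0.60, 0.61, 0.62]
centres, labels = mean_shift_1d(scores, bandwidth=0.05)
print(centres, labels)
```

The bandwidth controls how coarse the clustering is, which is why the abstract stresses that it is calculated automatically rather than left to the user.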
Demonstration of Linked UAV Observations and Atmospheric Model Predictions in Chem/Bio Attack Response

National Research Council Canada - National Science Library

Davidson, Kenneth

2003-01-01

... meteorological data, and the means for linking the UAV data to real-time dispersion prediction. The primary modeling effort focused on an adaptation of the 'Wind On Constant Streamline Surfaces...
Multi-objective evolutionary algorithms for fuzzy classification in survival prediction.

Science.gov (United States)

Jiménez, Fernando; Sánchez, Gracia; Juárez, José M

2014-03-01

This paper presents a novel rule-based fuzzy classification methodology for survival/mortality prediction in severely burned patients. Due to the ethical aspects involved in this medical scenario, physicians tend not to accept a computer-based evaluation unless they understand why and how such a recommendation is given. Therefore, any fuzzy classifier model must be both accurate and interpretable. The proposed methodology is a three-step process: (1) multi-objective constrained optimization of a patient's data set, using Pareto-based elitist multi-objective evolutionary algorithms to maximize accuracy and minimize the complexity (number of rules) of classifiers, subject to interpretability constraints; this step produces a set of alternative (Pareto) classifiers; (2) linguistic labeling, which assigns a linguistic label to each fuzzy set of the classifiers; this step is essential to the interpretability of the classifiers; (3) decision making, whereby a classifier is chosen, if it is satisfactory, according to the preferences of the decision maker. If no classifier is satisfactory for the decision maker, the process starts again in step (1) with a different input parameter set. The performance of three multi-objective evolutionary algorithms, niched pre-selection multi-objective algorithm, elitist Pareto-based multi-objective evolutionary algorithm for diversity reinforcement (ENORA) and the non-dominated sorting genetic algorithm (NSGA-II), was tested using a patient's data set from an intensive care burn unit and a standard machine learning data set from a standard machine learning repository. The results are compared using the hypervolume multi-objective metric. In addition, the results have been compared with other non-evolutionary techniques and validated with a multi-objective cross-validation technique. Our proposal improves the classification rate obtained by other non-evolutionary techniques (decision trees, artificial neural networks, Naive Bayes, and case
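The notion of a set of alternative (Pareto) classifiers in step (1) can be illustrated with a small sketch. It only filters a list of candidates for non-dominated ones under the two stated objectives (maximize accuracy, minimize number of rules); it is not an evolutionary algorithm, and the candidate values and function names are made up for illustration.

```python
def dominates(a, b):
    """True if classifier a is at least as good as b on both objectives and better on one.

    Each classifier is a tuple (accuracy, n_rules): higher accuracy and fewer rules are better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical (accuracy, number of rules) pairs.
candidates = [(0.90, 12), (0.88, 5), (0.85, 7), (0.80, 3), (0.90, 15)]
front = pareto_front(candidates)
```

Here (0.85, 7) drops out because (0.88, 5) is both more accurate and simpler, and (0.90, 15) drops out because (0.90, 12) matches its accuracy with fewer rules; the three remaining classifiers are the trade-offs a decision maker would choose among in step (3).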
An Evaluation of Algorithms for Identifying Metastatic Breast, Lung, or Colorectal Cancer in Administrative Claims Data.

Science.gov (United States)

Whyte, Joanna L; Engel-Nitz, Nicole M; Teitelbaum, April; Gomez Rey, Gabriel; Kallich, Joel D

2015-07-01

Administrative health care claims data are used for epidemiologic, health services, and outcomes cancer research and thus play a significant role in policy. Cancer stage, which is often a major driver of cost and clinical outcomes, is not typically included in claims data. Evaluate algorithms used in a dataset of cancer patients to identify patients with metastatic breast (BC), lung (LC), or colorectal (CRC) cancer using claims data. Clinical data on BC, LC, or CRC patients (between January 1, 2007 and March 31, 2010) were linked to a health care claims database. Inclusion required health plan enrollment ≥3 months before initial cancer diagnosis date. Algorithms were used in the claims database to identify patients' disease status, which was compared with physician-reported metastases. Generic and tumor-specific algorithms were evaluated using ICD-9 codes, varying diagnosis time frames, and including/excluding other tumors. Positive and negative predictive values, sensitivity, and specificity were assessed. The linked databases included 14,480 patients; of whom, 32%, 17%, and 14.2% had metastatic BC, LC, and CRC, respectively, at diagnosis and met inclusion criteria. Nontumor-specific algorithms had lower specificity than tumor-specific algorithms. Tumor-specific algorithms' sensitivity and specificity were 53% and 99% for BC, 55% and 85% for LC, and 59% and 98% for CRC, respectively. Algorithms to distinguish metastatic BC, LC, and CRC from locally advanced disease should use tumor-specific primary cancer codes with 2 claims for the specific primary cancer >30-42 days apart to reduce misclassification. These performed best overall in specificity, positive predictive values, and overall accuracy to identify metastatic cancer in a health care claims database.
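For readers less familiar with the four reported measures, the sketch below computes them from a 2x2 table of algorithm result versus physician-reported metastasis. The counts are invented so that sensitivity and specificity come out at 53% and 99% (the values reported for BC); they are not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # metastatic cases the algorithm flags
        "specificity": tn / (tn + fp),  # non-metastatic cases the algorithm clears
        "ppv": tp / (tp + fp),          # flagged cases that are truly metastatic
        "npv": tn / (tn + fn),          # cleared cases that are truly non-metastatic
    }


# Illustrative counts only: 100 metastatic and 100 non-metastatic patients.
metrics = diagnostic_metrics(tp=53, fp=1, fn=47, tn=99)
```

With these made-up counts the high specificity yields a high positive predictive value even though almost half of the metastatic cases are missed, which is the trade-off the abstract describes for tumor-specific algorithms.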
Application of Genetic Algorithm to Predict Optimal Sowing Region and Timing for Kentucky Bluegrass in China.

Directory of Open Access Journals (Sweden)

Erxu Pi

Full Text Available Temperature is a predominant environmental factor affecting grass germination and distribution. Various thermal-germination models for prediction of grass seed germination have been reported, in which the relationship between temperature and germination was defined with kernel functions, such as quadratic or quintic functions. However, their prediction accuracies warrant further improvements. The purpose of this study is to evaluate the relative prediction accuracies of genetic algorithm (GA) models, which are automatically parameterized with observed germination data. The seeds of five P. pratensis (Kentucky bluegrass, KB) cultivars were germinated under 36 day/night temperature regimes ranging from 5/5 to 40/40 °C with 5 °C increments. Results showed that optimal germination percentages of all five tested KB cultivars were observed under a fluctuating temperature regime of 20/25 °C. Meanwhile, the constant temperature regimes (e.g., 5/5, 10/10, 15/15 °C, etc.) suppressed the germination of all five cultivars. Furthermore, the back propagation artificial neural network (BP-ANN) algorithm was integrated to optimize temperature-germination response models from these observed germination data. It was found that integrations of GA-BP-ANN (back propagation aided genetic algorithm artificial neural network) significantly reduced the Root Mean Square Error (RMSE) values from 0.21~0.23 to 0.02~0.09. In an effort to provide a more reliable prediction of optimum sowing time for the tested KB cultivars in various regions in the country, the optimized GA-BP-ANN models were applied to map spatial and temporal germination percentages of bluegrass cultivars in China. Our results demonstrate that the GA-BP-ANN model is a convenient and reliable option for constructing thermal-germination response models since it automates model parameterization and has excellent prediction accuracy.
Accuracy assessment of pharmacogenetically predictive warfarin dosing algorithms in patients of an academic medical center anticoagulation clinic.

Science.gov (United States)

Shaw, Paul B; Donovan, Jennifer L; Tran, Maichi T; Lemon, Stephenie C; Burgwinkle, Pamela; Gore, Joel

2010-08-01

The objectives of this retrospective cohort study are to evaluate the accuracy of pharmacogenetic warfarin dosing algorithms in predicting therapeutic dose and to determine if this degree of accuracy warrants the routine use of genotyping to prospectively dose patients newly started on warfarin. Seventy-one patients of an outpatient anticoagulation clinic at an academic medical center who were age 18 years or older on a stable, therapeutic warfarin dose with international normalized ratio (INR) goal between 2.0 and 3.0, and cytochrome P450 isoenzyme 2C9 (CYP2C9) and vitamin K epoxide reductase complex subunit 1 (VKORC1) genotypes available between January 1, 2007 and September 30, 2008 were included. Six pharmacogenetic warfarin dosing algorithms were identified from the medical literature. Additionally, a 5 mg fixed dose approach was evaluated. Three algorithms, Zhu et al. (Clin Chem 53:1199-1205, 2007), Gage et al. (J Clin Ther 84:326-331, 2008), and International Warfarin Pharmacogenetic Consortium (IWPC) (N Engl J Med 360:753-764, 2009) were similar in the primary accuracy endpoints with mean absolute error (MAE) ranging from 1.7 to 1.8 mg/day and coefficient of determination R² from 0.61 to 0.66. However, the Zhu et al. algorithm severely over-predicted dose (defined as ≥2x or ≥2 mg/day more than actual dose) in twice as many (14 vs. 7%) patients as Gage et al. 2008 and IWPC 2009. In conclusion, the algorithms published by Gage et al. 2008 and the IWPC 2009 were the two most accurate pharmacogenetically based equations available in the medical literature in predicting therapeutic warfarin dose in our study population. However, the degree of accuracy demonstrated does not support the routine use of genotyping to prospectively dose all patients newly started on warfarin.
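The two primary accuracy endpoints can be sketched as follows. The dose values are invented, the function names are hypothetical, and the R² shown is the common "1 minus residual over total sum of squares" form; the study may have computed it differently (for example as a squared correlation), which the abstract does not state.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted doses (mg/day)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up stable doses and algorithm predictions, in mg/day.
actual_dose = [2.0, 4.0, 6.0, 8.0]
predicted_dose = [3.0, 4.0, 5.0, 8.0]
mae = mean_absolute_error(actual_dose, predicted_dose)
r2 = r_squared(actual_dose, predicted_dose)
```

MAE is expressed in the same units as the dose, which is why the abstract can report it directly in mg/day, whereas R² is unitless and describes the share of dose variability the algorithm accounts for.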
Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

Science.gov (United States)

Patel, Nihir; Wang, Jason T L

2015-10-01

Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.
PhenoLink - a web-tool for linking phenotype to ~omics data for bacteria: application to gene-trait matching for Lactobacillus plantarum strains

Directory of Open Access Journals (Sweden)

Bayjanov Jumamurat R

2012-05-01

Full Text Available Background: Linking phenotypes to high-throughput molecular biology information generated by ~omics technologies allows revealing cellular mechanisms underlying an organism's phenotype. ~Omics datasets are often very large and noisy with many features (e.g., genes, metabolite abundances). Thus, associating phenotypes to ~omics data requires an approach that is robust to noise and can handle large and diverse data sets. Results: We developed a web-tool PhenoLink (http://bamics2.cmbi.ru.nl/websoftware/phenolink/) that links phenotype to ~omics data sets using well-established as well as new techniques. PhenoLink imputes missing values and preprocesses input data (i) to decrease inherent noise in the data and (ii) to counterbalance pitfalls of the Random Forest algorithm, on which feature (e.g., gene) selection is based. Preprocessed data is used in feature (e.g., gene) selection to identify relations to phenotypes. We applied PhenoLink to identify gene-phenotype relations based on the presence/absence of 2847 genes in 42 Lactobacillus plantarum strains and phenotypic measurements of these strains in several experimental conditions, including growth on sugars and nitrogen-dioxide production. Genes were ranked based on their importance (predictive value) to correctly predict the phenotype of a given strain. In addition to known gene to phenotype relations we also found novel relations. Conclusions: PhenoLink is an easily accessible web-tool to facilitate identifying relations from large and often noisy phenotype and ~omics datasets. Visualization of links to phenotypes offered in PhenoLink allows prioritizing links, finding relations between features, finding relations between phenotypes, and identifying outliers in phenotype data. PhenoLink can be used to uncover phenotype links to a multitude of ~omics data, e.g., gene presence/absence (determined by, e.g., CGH or next-generation sequencing), gene expression (determined by, e.g., microarrays or RNA
A parallel algorithm for the initial screening of space debris collisions prediction using the SGP4/SDP4 models and GPU acceleration

Science.gov (United States)

Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

2017-05-01

Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a simulation of collision prediction for space debris of any amount and for any time span can be executed. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on Tesla C2075 of NVIDIA. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation.
From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks.

KAUST Repository

Cannistraci, C.V.

2013-04-08

Growth and remodelling impact the network topology of complex systems, yet a general theory explaining how new links arise between existing nodes has been lacking, and little is known about the topological properties that facilitate link-prediction. Here we investigate the extent to which the connectivity evolution of a network might be predicted by mere topological features. We show how a link/community-based strategy triggers substantial prediction improvements because it accounts for the singular topology of several real networks organised in multiple local communities - a tendency here named local-community-paradigm (LCP). We observe that LCP networks are mainly formed by weak interactions and characterise heterogeneous and dynamic systems that use self-organisation as a major adaptation strategy. These systems seem designed for global delivery of information and processing via multiple local modules. Conversely, non-LCP networks have steady architectures formed by strong interactions, and seem designed for systems in which information/energy storage is crucial.
A Modified Spatiotemporal Fusion Algorithm Using Phenological Information for Predicting Reflectance of Paddy Rice in Southern China

Directory of Open Access Journals (Sweden)

Mengxue Liu

2018-05-01

Full Text Available Satellite data for studying surface dynamics in heterogeneous landscapes are missing due to frequent cloud contamination, low temporal resolution, and technological difficulties in developing satellites. A modified spatiotemporal fusion algorithm for predicting the reflectance of paddy rice is presented in this paper. The algorithm uses phenological information extracted from a moderate-resolution imaging spectroradiometer enhanced vegetation index time series to improve the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM). The algorithm is tested with satellite data on Yueyang City, China. The main contribution of the modified algorithm is the selection of similar neighborhood pixels by using phenological information to improve accuracy. Results show that the modified algorithm performs better than ESTARFM in visual inspection and quantitative metrics, especially for paddy rice. This modified algorithm provides not only new ideas for the improvement of spatiotemporal data fusion methods, but also technical support for the generation of remote sensing data with high spatial and temporal resolution.
Prediction of Tibial Rotation Pathologies Using Particle Swarm Optimization and K-Means Algorithms.

Science.gov (United States)

Sari, Murat; Tuna, Can; Akogul, Serkan

2018-03-28

The aim of this article is to investigate pathological subjects from a population through different physical factors. To achieve this, particle swarm optimization (PSO) and K-means (KM) clustering algorithms have been combined (PSO-KM). Datasets provided by the literature were divided into three clusters based on age and weight parameters and each one of right tibial external rotation (RTER), right tibial internal rotation (RTIR), left tibial external rotation (LTER), and left tibial internal rotation (LTIR) values were divided into three types as Type 1, Type 2 and Type 3 (Type 2 is non-pathological (normal) and the other two types are pathological (abnormal)), respectively. The rotation values of every subject in any cluster were noted. Then the algorithm was run and the produced values were also considered. The values produced by the algorithm, the PSO-KM, have been compared with the real values. The hybrid PSO-KM algorithm has been very successful on the optimal clustering of the tibial rotation types through the physical criteria. In this investigation, Type 2 (pathological subjects) is of especially high predictability and the PSO-KM algorithm has been very successful as an operation system for clustering and optimizing the tibial motion data assessments. These research findings are expected to be very useful for health providers, such as physiotherapists and orthopedists, and may help clinicians design appropriate treatment schedules for patients.
VISUALIZATION OF PAGERANK ALGORITHM

OpenAIRE

Perhaj, Ervin

2013-01-01

The goal of the thesis is to develop a web application that helps users understand the functioning of the PageRank algorithm. The thesis consists of two parts. First we develop an algorithm to calculate PageRank values of web pages. The input of the algorithm is a list of web pages and links between them. The user enters the list through the web interface. From the data the algorithm calculates a PageRank value for each page. The algorithm repeats the process, until the difference of PageRank va...
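The iterative calculation described above can be sketched as a standard power iteration. This is not the thesis's code: the damping factor of 0.85, the handling of pages without outgoing links, the stopping rule (total change in rank below a tolerance, assumed because the abstract is cut off at that point), and all names are assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Compute PageRank for a dict mapping each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for src, targets in links.items():
            for t in targets:
                new_rank[t] += damping * rank[src] / len(targets)
        # Stop once the ranks have (almost) stopped changing between iterations.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

In this small example page C ends up with the highest value because it receives links from both other pages, while B, linked only from A, ends up lowest.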

A Model Predictive Algorithm for Active Control of Nonlinear Noise Processes

Directory of Open Access Journals (Sweden)

Qi-Zhi Zhang

2005-01-01

Full Text Available In this paper, an improved nonlinear Active Noise Control (ANC) system is achieved by introducing an appropriate secondary source. For an ANC system to be successfully implemented, the nonlinearity of the primary path and time delay of the secondary path must be overcome. A nonlinear Model Predictive Control (MPC) strategy is introduced to deal with the time delay in the secondary path and the nonlinearity in the primary path of the ANC system. An overall online modeling technique is utilized for online secondary path and primary path estimation. The secondary path is estimated using an adaptive FIR filter, and the primary path is estimated using a Neural Network (NN). The two models are connected in parallel with the two paths. In this system, the mutual disturbances between the operation of the nonlinear ANC controller and modeling of the secondary path can be greatly reduced. The coefficients of the adaptive FIR filter and weight vector of the NN are adjusted online. Computer simulations are carried out to compare the proposed nonlinear MPC method with the nonlinear Filter-x Least Mean Square (FXLMS) algorithm. The results showed that the convergence speed of the proposed nonlinear MPC algorithm is faster than that of the nonlinear FXLMS algorithm. To test the robustness of the proposed nonlinear ANC system, sudden changes in the secondary path and primary path of the ANC system are considered. Results indicated that the proposed nonlinear ANC system can rapidly track the sudden changes in the acoustic paths of the nonlinear ANC system, and keep the adaptive algorithm stable when the nonlinear ANC system is time-varying.
Improved feature selection based on genetic algorithms for real time disruption prediction on JET

Energy Technology Data Exchange (ETDEWEB)

Ratta, G.A., E-mail: garatta@gateme.unsj.edu.ar [GATEME, Facultad de Ingenieria, Universidad Nacional de San Juan, Avda. San Martin 1109 (O), 5400 San Juan (Argentina); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom); Vega, J. [Asociacion EURATOM/CIEMAT para Fusion, Avda. Complutense, 40, 28040 Madrid (Spain); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom); Murari, A. [Associazione EURATOM-ENEA per la Fusione, Consorzio RFX, 4-35127 Padova (Italy); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom)

2012-09-15

Highlights: ► A new signal selection methodology to improve disruption prediction is reported. ► The approach is based on Genetic Algorithms. ► An advanced predictor has been created with the new set of signals. ► The new system obtains considerably higher prediction rates. - Abstract: The early prediction of disruptions is an important aspect of the research in the field of Tokamak control. A very recent predictor, called 'Advanced Predictor Of Disruptions' (APODIS), developed for the 'Joint European Torus' (JET), implements the real time recognition of incoming disruptions with the best success rate achieved ever and an outstanding stability for long periods following training. In this article, a new methodology to select the set of the signals' parameters in order to maximize the performance of the predictor is reported. The approach is based on 'Genetic Algorithms' (GAs). With the feature selection derived from GAs, a new version of APODIS has been developed. The results are significantly better than the previous version not only in terms of success rates but also in extending the interval before the disruption in which reliable predictions are achieved. Correct disruption predictions with a success rate in excess of 90% have been achieved 200 ms before the time of the disruption. The predictor response is compared with that of JET's Protection System (JPS) and the APODIS predictor is shown to be far superior. Both systems have been carefully tested with a wide number of discharges to understand their relative merits and the most profitable directions of further improvements.
The utility and limitations of current web-available algorithms to predict peptides recognized by CD4 T cells in response to pathogen infection

Science.gov (United States)

Chaves, Francisco A.; Lee, Alvin H.; Nayak, Jennifer; Richards, Katherine A.; Sant, Andrea J.

2012-01-01

The ability to track CD4 T cells elicited in response to pathogen infection or vaccination is critical because of the role these cells play in protective immunity. Coupled with advances in genome sequencing of pathogenic organisms, there is considerable appeal for implementation of computer-based algorithms to predict peptides that bind to the class II molecules, forming the complex recognized by CD4 T cells. Despite recent progress in this area, there is a paucity of data regarding their success in identifying actual pathogen-derived epitopes. In this study, we sought to rigorously evaluate the performance of multiple web-available algorithms by comparing their predictions and our results using purely empirical methods for epitope discovery in influenza that utilized overlapping peptides and cytokine Elispots, for three independent class II molecules. We analyzed the data in different ways, trying to anticipate how an investigator might use these computational tools for epitope discovery. We come to the conclusion that currently available algorithms can indeed facilitate epitope discovery, but all shared a high degree of false positive and false negative predictions. Therefore, efficiencies were low. We also found dramatic disparities among algorithms and between predicted IC50 values and true dissociation rates of peptide:MHC class II complexes. We suggest that improved success of predictive algorithms will depend less on changes in computational methods or increased data sets and more on changes in parameters used to “train” the algorithms that factor in elements of T cell repertoire and peptide acquisition by class II molecules. PMID:22467652
Distributed and Cooperative Link Scheduling for Large-Scale Multihop Wireless Networks

Directory of Open Access Journals (Sweden)

Swami Ananthram

2007-01-01

Full Text Available A distributed and cooperative link-scheduling (DCLS) algorithm is introduced for large-scale multihop wireless networks. With this algorithm, each and every active link in the network cooperatively calibrates its environment and converges to a desired link schedule for data transmissions within a time frame of multiple slots. This schedule is such that the entire network is partitioned into a set of interleaved subnetworks, where each subnetwork consists of concurrent cochannel links that are properly separated from each other. The desired spacing in each subnetwork can be controlled by a tuning parameter and the number of time slots specified for each frame. Following the DCLS algorithm, a distributed and cooperative power control (DCPC) algorithm can be applied to each subnetwork to ensure a desired data rate for each link with minimum network transmission power. As shown consistently by simulations, the DCLS algorithm along with a DCPC algorithm yields significant power savings. The power savings also imply an increased feasible region of averaged link data rates for the entire network.
Online available capacity prediction and state of charge estimation based on advanced data-driven algorithms for lithium iron phosphate battery

International Nuclear Information System (INIS)

Deng, Zhongwei; Yang, Lin; Cai, Yishan; Deng, Hao; Sun, Liu

2016-01-01

The key technology of a battery management system is accurate and robust online estimation of the battery states. For the lithium iron phosphate battery, the relationship between state of charge and open circuit voltage has a plateau region which limits the estimation accuracy of voltage-based algorithms. The open circuit voltage hysteresis requires advanced online identification algorithms to cope with the strongly nonlinear battery model. The available capacity, as a crucial parameter, contributes to the state of charge and state of health estimation of the battery, but it is difficult to predict because it is influenced jointly by temperature, aging and current rates. To address these problems, ampere-hour counting with current correction and the dual adaptive extended Kalman filter algorithm are combined to estimate model parameters and state of charge. This combination offers a lower computational burden and greater robustness. Considering the influence of temperature and degradation, a data-driven algorithm, the least squares support vector machine, is implemented to predict the available capacity. The state estimation and capacity prediction methods are coupled to improve the estimation accuracy at different temperatures over the lifetime of the battery. The experimental results verify that the proposed methods have excellent state and available capacity estimation accuracy. - Highlights: • A dual adaptive extended Kalman filter is used to estimate parameters and states. • A correction term is introduced to consider the effect of current rates. • The least squares support vector machine is used to predict the available capacity. • The experimental results verify the proposed state and capacity prediction methods.
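As a point of reference for the ampere-hour counting component, the sketch below integrates current over time to update the state of charge. It is plain ampere-hour counting only: the paper's current-correction term and the Kalman filter are not reproduced because the abstract does not specify them. The sign convention (discharge current positive), the clamping to the range 0 to 1, and all names and numbers are assumptions.

```python
def ampere_hour_soc(soc0, currents_a, dt_s, capacity_ah):
    """Track state of charge (0..1) by integrating current samples.

    soc0        -- initial state of charge
    currents_a  -- current samples in amperes (positive = discharge, assumed convention)
    dt_s        -- sampling interval in seconds
    capacity_ah -- available capacity in ampere-hours
    """
    soc = soc0
    history = []
    for current in currents_a:
        # Charge removed in this step, as a fraction of the available capacity.
        soc -= current * dt_s / (3600.0 * capacity_ah)
        soc = min(1.0, max(0.0, soc))
        history.append(soc)
    return history


# Hypothetical test: a full 20 Ah cell discharged at 10 A for one hour (360 steps of 10 s).
soc_trace = ampere_hour_soc(soc0=1.0, currents_a=[10.0] * 360, dt_s=10.0, capacity_ah=20.0)
```

The update divides by the available capacity, so any error in that capacity feeds directly into the state of charge estimate, which is why the abstract couples the state estimation with a separate capacity prediction.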
WDM Multicast Tree Construction Algorithms and Their Comparative Evaluations

Science.gov (United States)

Makabe, Tsutomu; Mikoshi, Taiju; Takenaka, Toyofumi

We propose novel tree construction algorithms for multicast communication in photonic networks. Since multicast communications consume many more link resources than unicast communications, effective algorithms for route selection and wavelength assignment are required. We propose a novel tree construction algorithm, called the Weighted Steiner Tree (WST) algorithm and a variation of the WST algorithm, called the Composite Weighted Steiner Tree (CWST) algorithm. Because these algorithms are based on the Steiner Tree algorithm, link resources among source and destination pairs tend to be commonly used and link utilization ratios are improved. Because of this, these algorithms can accept many more multicast requests than other multicast tree construction algorithms based on the Dijkstra algorithm. However, under certain delay constraints, the blocking characteristics of the proposed Weighted Steiner Tree algorithm deteriorate since some light paths between source and destinations use many hops and cannot satisfy the delay constraint. In order to adapt the approach to the delay-sensitive environments, we have devised the Composite Weighted Steiner Tree algorithm comprising the Weighted Steiner Tree algorithm and the Dijkstra algorithm for use in a delay constrained environment such as an IPTV application. In this paper, we also give the results of simulation experiments which demonstrate the superiority of the proposed Composite Weighted Steiner Tree algorithm compared with the Distributed Minimum Hop Tree (DMHT) algorithm, from the viewpoint of the light-tree request blocking.
Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm

Science.gov (United States)

Öhrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

2010-01-01

One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes. PMID:20864443
Validation Study of a Predictive Algorithm to Evaluate Opioid Use Disorder in a Primary Care Setting

Science.gov (United States)

Sharma, Maneesh; Lee, Chee; Kantorovich, Svetlana; Tedtaotao, Maria; Smith, Gregory A.

2017-01-01

Background: Opioid abuse in chronic pain patients is a major public health issue. Primary care providers are frequently the first to prescribe opioids to patients suffering from pain, yet do not always have the time or resources to adequately evaluate the risk of opioid use disorder (OUD). Purpose: This study seeks to determine the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm (“profile”) incorporating phenotypic and, more uniquely, genotypic risk factors. Methods and Results: In a validation study with 452 participants diagnosed with OUD and 1237 controls, the algorithm successfully categorized patients at high and moderate risk of OUD with 91.8% sensitivity. Regardless of changes in the prevalence of OUD, sensitivity of the algorithm remained >90%. Conclusion: The algorithm correctly stratifies primary care patients into low-, moderate-, and high-risk categories to appropriately identify patients in need of additional guidance, monitoring, or treatment changes. PMID:28890908
Computational intelligence, medicine and biology selected links

CERN Document Server

Zaitseva, Elena

2015-01-01

This book contains an interesting and state-of-the-art collection of chapters presenting several examples of attempts to develop modern tools utilizing computational intelligence in different real-life problems encountered by humans. Reasoning, prediction, modeling, optimization, decision making, etc. need modern, soft and intelligent algorithms, methods and methodologies to solve efficiently the problems appearing in human activity. The book is divided into two parts. Part I, consisting of four chapters, is devoted to selected links of computational intelligence, medicine, health care and biomechanics. Several problems are considered: estimation of healthcare system reliability, classification of ultrasound thyroid images, application of fuzzy logic to measure weight status and central fatness, and deriving kinematics directly from video records. Part II, also consisting of four chapters, is devoted to selected links of computational intelligence and biology. The common denominato...
An early-biomarker algorithm predicts lethal graft-versus-host disease and survival.

Science.gov (United States)

Hartwell, Matthew J; Özbek, Umut; Holler, Ernst; Renteria, Anne S; Major-Monfried, Hannah; Reddy, Pavan; Aziz, Mina; Hogan, William J; Ayuk, Francis; Efebera, Yvonne A; Hexner, Elizabeth O; Bunworasate, Udomsak; Qayed, Muna; Ordemann, Rainer; Wölfl, Matthias; Mielke, Stephan; Pawarode, Attaphol; Chen, Yi-Bin; Devine, Steven; Harris, Andrew C; Jagasia, Madan; Kitko, Carrie L; Litzow, Mark R; Kröger, Nicolaus; Locatelli, Franco; Morales, George; Nakamura, Ryotaro; Reshef, Ran; Rösler, Wolf; Weber, Daniela; Wudhikarn, Kitsada; Yanik, Gregory A; Levine, John E; Ferrara, James L M

2017-02-09

BACKGROUND. No laboratory test can predict the risk of nonrelapse mortality (NRM) or severe graft-versus-host disease (GVHD) after hematopoietic cellular transplantation (HCT) prior to the onset of GVHD symptoms. METHODS. Patient blood samples on day 7 after HCT were obtained from a multicenter set of 1,287 patients, and 620 samples were assigned to a training set. We measured the concentrations of 4 GVHD biomarkers (ST2, REG3α, TNFR1, and IL-2Rα) and used them to model 6-month NRM using rigorous cross-validation strategies to identify the best algorithm that defined 2 distinct risk groups. We then applied the final algorithm in an independent test set (n = 309) and validation set (n = 358). RESULTS. A 2-biomarker model using ST2 and REG3α concentrations identified patients with a cumulative incidence of 6-month NRM of 28% in the high-risk group and 7% in the low-risk group (P < 0.001). The algorithm performed equally well in the test set (33% vs. 7%, P < 0.001) and the multicenter validation set (26% vs. 10%, P < 0.001). Sixteen percent, 17%, and 20% of patients were at high risk in the training, test, and validation sets, respectively. GVHD-related mortality was greater in high-risk patients (18% vs. 4%, P < 0.001), as was severe gastrointestinal GVHD (17% vs. 8%, P < 0.001). The same algorithm can be successfully adapted to define 3 distinct risk groups at GVHD onset. CONCLUSION. A biomarker algorithm based on a blood sample taken 7 days after HCT can consistently identify a group of patients at high risk for lethal GVHD and NRM. FUNDING. The National Cancer Institute, American Cancer Society, and the Doris Duke Charitable Foundation.
Conjugate-Gradient Algorithms For Dynamics Of Manipulators

Science.gov (United States)

Fijany, Amir; Scheid, Robert E.

1993-01-01

Algorithms for serial and parallel computation of forward dynamics of multiple-link robotic manipulators by conjugate-gradient method developed. Parallel algorithms have potential for speedup of computations on multiple linked, specialized processors implemented in very-large-scale integrated circuits. Such processors used to simulate dynamics, possibly faster than in real time, for purposes of planning and control.
Earthquake prediction analysis based on empirical seismic rate: the M8 algorithm

Science.gov (United States)

Molchan, G.; Romashkova, L.

2010-12-01

The quality of space-time earthquake prediction is usually characterized by a 2-D error diagram (n, τ), where n is the fraction of failures-to-predict and τ is the local rate of alarm averaged in space. The most reasonable averaging measure for analysis of a prediction strategy is the normalized rate of target events λ(dg) in a subarea dg. In that case the quantity H = 1 - (n + τ) determines the prediction capability of the strategy. The uncertainty of λ(dg) causes difficulties in estimating H and the statistical significance, α, of prediction results. We investigate this problem theoretically and show how the uncertainty of the measure can be taken into account in two situations, viz., the estimation of α and the construction of a confidence zone for the (n, τ)-parameters of the random strategies. We use our approach to analyse the results from prediction of M >= 8.0 events by the M8 method for the period 1985-2009 (the M8.0+ test). The model of λ(dg) based on the events Mw >= 5.5, 1977-2004, and the magnitude range of target events 8.0 <= M < 8.5 are considered as basic to this M8 analysis. We find the point and upper estimates of α and show that they are still unstable because the number of target events in the experiment is small. However, our results argue in favour of non-triviality of the M8 prediction algorithm.
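The error-diagram quantities defined in this abstract can be illustrated with a small Python sketch. It computes n, τ and H = 1 - (n + τ) for a toy alarm record. The function name, the sample numbers, and the simplification of τ to a plain fraction of space-time cells under alarm (rather than an average weighted by λ(dg)) are assumptions made for illustration only and are not taken from the paper.

```python
def prediction_capability(n_targets, n_missed, alarm_cells, total_cells):
    """Return (n, tau, H) for a prediction strategy.

    n   : fraction of target events that were not predicted
    tau : fraction of space-time cells occupied by alarms
          (assumption: uniform weights instead of the lambda(dg) measure)
    H   : 1 - (n + tau); random-guess strategies give H = 0
    """
    if n_targets <= 0 or total_cells <= 0:
        raise ValueError("need at least one target event and one cell")
    n = n_missed / n_targets
    tau = alarm_cells / total_cells
    return n, tau, 1.0 - (n + tau)


# Hypothetical record: 10 target events, 3 missed, alarms in 250 of 1000 cells.
n, tau, H = prediction_capability(10, 3, 250, 1000)
print(f"n={n:.2f} tau={tau:.2f} H={H:.2f}")
```

With so few target events, small changes in the number of misses move H substantially, which is consistent with the instability the authors report.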
Predicting peptides binding to MHC class II molecules using multi-objective evolutionary algorithms

Directory of Open Access Journals (Sweden)

Feng Lin

2007-11-01

Full Text Available Background: Peptides binding to Major Histocompatibility Complex (MHC) class II molecules are crucial for initiation and regulation of immune responses. Predicting peptides that bind to a specific MHC molecule plays an important role in determining potential candidates for vaccines. The binding groove in class II MHC is open at both ends, allowing peptides longer than 9-mer to bind. Finding the consensus motif facilitating the binding of peptides to an MHC class II molecule is difficult because of different lengths of binding peptides and varying location of the 9-mer binding core. The level of difficulty increases when the molecule is promiscuous and binds to a large number of low affinity peptides. In this paper, we propose two approaches using multi-objective evolutionary algorithms (MOEA) for predicting peptides binding to MHC class II molecules. One uses the information from both binders and non-binders for self-discovery of motifs. The other, in addition, uses information from experimentally determined motifs for guided-discovery of motifs. Results: The proposed methods are intended for finding peptides binding to MHC class II I-Ag7 molecule – a promiscuous binder to a large number of low affinity peptides. Cross-validation results across experiments on two motifs derived for I-Ag7 datasets demonstrate better generalization abilities and accuracies of the present method over earlier approaches. Further, the proposed method was validated and compared on two publicly available benchmark datasets: (1) an ensemble of qualitative HLA-DRB1*0401 peptide data obtained from five different sources, and (2) quantitative peptide data obtained for sixteen different alleles comprising three mouse alleles and thirteen HLA alleles. The proposed method outperformed earlier methods on most datasets, indicating that it is well suited for finding peptides binding to MHC class II molecules. Conclusion: We present two MOEA-based algorithms for finding motifs
A Dual-Channel Acquisition Method Based on Extended Replica Folding Algorithm for Long Pseudo-Noise Code in Inter-Satellite Links.

Science.gov (United States)

Zhao, Hongbo; Chen, Yuying; Feng, Wenquan; Zhuang, Chen

2018-05-25

Inter-satellite links are an important component of the new generation of satellite navigation systems, characterized by low signal-to-noise ratio (SNR), complex electromagnetic interference and the short time slot of each satellite, which brings difficulties to the acquisition stage. The inter-satellite links in both the Global Positioning System (GPS) and the BeiDou Navigation Satellite System (BDS) adopt the long code spread spectrum system. However, long code acquisition is a difficult and time-consuming task due to the long code period. Traditional folding methods such as extended replica folding acquisition search technique (XFAST) and direct average are largely restricted because of code Doppler and additional SNR loss caused by replica folding. The dual folding method (DF-XFAST) and dual-channel method have been proposed to achieve long code acquisition in low SNR and high dynamic situations, respectively, but the former is easily affected by code Doppler and the latter is not fast enough. Considering the environment of inter-satellite links and the problems of existing algorithms, this paper proposes a new long code acquisition algorithm named dual-channel acquisition method based on the extended replica folding algorithm (DC-XFAST). This method employs dual channels for verification. Each channel contains an incoming signal block. Local code samples are folded and zero-padded to the length of the incoming signal block. After a circular FFT operation, the correlation results contain two peaks of the same magnitude and specified relative position. The detection process is eased through finding the two largest values. The verification takes all the full and partial peaks into account. Numerical results reveal that the DC-XFAST method can improve acquisition performance while acquisition speed is guaranteed. The method has a significantly higher acquisition probability than folding methods XFAST and DF-XFAST. Moreover, with the advantage of higher detection
A Dual-Channel Acquisition Method Based on Extended Replica Folding Algorithm for Long Pseudo-Noise Code in Inter-Satellite Links

Directory of Open Access Journals (Sweden)

Hongbo Zhao

2018-05-01

Full Text Available Inter-satellite links are an important component of the new generation of satellite navigation systems, characterized by low signal-to-noise ratio (SNR), complex electromagnetic interference and the short time slot of each satellite, which brings difficulties to the acquisition stage. The inter-satellite links in both the Global Positioning System (GPS) and the BeiDou Navigation Satellite System (BDS) adopt the long code spread spectrum system. However, long code acquisition is a difficult and time-consuming task due to the long code period. Traditional folding methods such as extended replica folding acquisition search technique (XFAST) and direct average are largely restricted because of code Doppler and additional SNR loss caused by replica folding. The dual folding method (DF-XFAST) and dual-channel method have been proposed to achieve long code acquisition in low SNR and high dynamic situations, respectively, but the former is easily affected by code Doppler and the latter is not fast enough. Considering the environment of inter-satellite links and the problems of existing algorithms, this paper proposes a new long code acquisition algorithm named dual-channel acquisition method based on the extended replica folding algorithm (DC-XFAST). This method employs dual channels for verification. Each channel contains an incoming signal block. Local code samples are folded and zero-padded to the length of the incoming signal block. After a circular FFT operation, the correlation results contain two peaks of the same magnitude and specified relative position. The detection process is eased through finding the two largest values. The verification takes all the full and partial peaks into account. Numerical results reveal that the DC-XFAST method can improve acquisition performance while acquisition speed is guaranteed. The method has a significantly higher acquisition probability than folding methods XFAST and DF-XFAST. Moreover, with the advantage of higher
MSD-MAP: A Network-Based Systems Biology Platform for Predicting Disease-Metabolite Links.

Science.gov (United States)

Wathieu, Henri; Issa, Naiem T; Mohandoss, Manisha; Byers, Stephen W; Dakshanamurthy, Sivanesan

2017-01-01

Cancer-associated metabolites result from cell-wide mechanisms of dysregulation. The field of metabolomics has sought to identify these aberrant metabolites as disease biomarkers, clues to understanding disease mechanisms, or even as therapeutic agents. This study was undertaken to reliably predict metabolites associated with colorectal, esophageal, and prostate cancers. Metabolite and disease biological action networks were compared in a computational platform called MSD-MAP (Multi Scale Disease-Metabolite Association Platform). Using differential gene expression analysis with patient-based RNAseq data from The Cancer Genome Atlas, genes up- or down-regulated in cancer compared to normal tissue were identified. Relational databases were used to map biological entities including pathways, functions, and interacting proteins, to those differential disease genes. Similar relational maps were built for metabolites, stemming from known and in silico predicted metabolite-protein associations. The hypergeometric test was used to find statistically significant relationships between disease and metabolite biological signatures at each tier, and metabolites were assessed for multi-scale association with each cancer. Metabolite networks were also directly associated with various other diseases using a disease functional perturbation database. Our platform recapitulated metabolite-disease links that have been empirically verified in the scientific literature, with network-based mapping of jointly-associated biological activity also matching known disease mechanisms. This was true for colorectal, esophageal, and prostate cancers, using metabolite action networks stemming from both predicted and known functional protein associations. By employing systems biology concepts, MSD-MAP reliably predicted known cancer-metabolite links, and may serve as a predictive tool to streamline conventional metabolomic profiling methodologies. Copyright© Bentham Science Publishers; For any
Design of a fuzzy differential evolution algorithm to predict non-deposition sediment transport

Science.gov (United States)

Ebtehaj, Isa; Bonakdari, Hossein

2017-12-01

Since the flow entering a sewer contains solid matter, deposition at the bottom of the channel is inevitable. It is difficult to understand the complex, three-dimensional mechanism of sediment transport in sewer pipelines. Therefore, a method to estimate the limiting velocity is necessary for optimal designs. Due to the inability of gradient-based algorithms to train Adaptive Neuro-Fuzzy Inference Systems (ANFIS) for non-deposition sediment transport prediction, a new hybrid ANFIS method based on a differential evolutionary algorithm (ANFIS-DE) is developed. The training and testing performance of ANFIS-DE is evaluated using a wide range of dimensionless parameters gathered from the literature. The input combination used to estimate the densimetric Froude number (Fr) includes the volumetric sediment concentration (C_V), ratio of median particle diameter to hydraulic radius (d/R), ratio of median particle diameter to pipe diameter (d/D) and overall friction factor of sediment (λ_s). The testing results are compared with the ANFIS model and regression-based equation results. The ANFIS-DE technique predicted sediment transport at limit of deposition with lower root mean square error (RMSE = 0.323) and mean absolute percentage of error (MAPE = 0.065) and higher accuracy (R² = 0.965) than the ANFIS model and regression-based equations.
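For readers unfamiliar with the three scores quoted in this abstract, the following Python sketch shows one conventional way to compute them. The abstract does not state the exact formulas used, so the definitions here are assumptions: MAPE is expressed as a fraction rather than a percentage, and R² is taken as the coefficient of determination. The sample values are invented.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def mape(observed, predicted):
    """Mean absolute percentage error as a fraction; observed values must be non-zero."""
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented densimetric Froude numbers, for illustration only.
fr_observed = [2.0, 4.0, 5.0]
fr_predicted = [2.5, 3.5, 5.0]
print(rmse(fr_observed, fr_predicted),
      mape(fr_observed, fr_predicted),
      r_squared(fr_observed, fr_predicted))
```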
CorVue algorithm efficacy to predict heart failure in real life: Unnecessary and potentially misleading information?

Science.gov (United States)

Palfy, Julia Anna; Benezet-Mazuecos, Juan; Milla, Juan Martinez; Iglesias, Jose Antonio; de la Vieja, Juan Jose; Sanchez-Borque, Pepa; Miracle, Angel; Rubio, Jose Manuel

2018-06-01

Heart failure (HF) hospitalizations have a negative impact on quality of life and imply important costs. Intrathoracic impedance (ITI) variations detected by cardiac devices have been hypothesized to predict HF hospitalizations. Although the Optivol™ algorithm (Medtronic) has been widely studied, the long-term efficacy of the CorVue™ algorithm (St. Jude Medical) has not been systematically evaluated in a "real life" cohort. CorVue™ was activated in ICD/CRT-D patients to store information about ITI measures. Clinical events (new episodes of HF requiring treatment and hospitalizations) and CorVue™ data were recorded every three months. A CorVue™ detection was considered appropriate for HF if it occurred in the four weeks prior to the clinical event. 53 ICD/CRT-D (26 ICD and 27 CRT-D) patients (67±1 years-old, 79% male) were included. Device position was subcutaneous in 28 patients. At inclusion, mean LVEF was 25±7% and 27 patients (51%) were in NYHA class I, 18 (34%) class II and 8 (15%) class III. After a mean follow-up of 17±9 months, 105 ITI drop alarms were detected in 32 patients (60%). Only six alarms were appropriate (true positive) and required hospitalization. Eighteen patients (34%) presented 25 clinical episodes (12 hospitalizations and 13 ER/ambulatory treatment modifications). Nineteen of these clinical episodes (76%) remained undetected by the CorVue™ (false negative). The sensitivity of CorVue™ was 24%, specificity 70%, positive predictive value 6% and negative predictive value 93%. CorVue™ showed a low sensitivity to predict HF events. Therefore, routine activation of this algorithm could generate misleading information. This article is protected by copyright. All rights reserved.
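Two of the reported figures can be reproduced directly from the counts in this abstract, as the Python sketch below shows. It assumes that every alarm not counted as appropriate is a false positive; the function names are illustrative. Specificity and negative predictive value are not recomputed, because the abstract does not give the number of true negatives.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of clinical episodes that were preceded by an alarm."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of alarms that were followed by a clinical episode."""
    return true_pos / (true_pos + false_pos)


# Counts taken from the abstract.
alarms = 105
clinical_episodes = 25
true_pos = 6

false_neg = clinical_episodes - true_pos   # 19 undetected episodes
false_pos = alarms - true_pos              # assumption: all other alarms are false

sens = sensitivity(true_pos, false_neg)
ppv = positive_predictive_value(true_pos, false_pos)
print(f"sensitivity = {sens:.0%}, PPV = {ppv:.0%}")
```

The sketch gives 6/25 = 24% and 6/105 ≈ 6%, matching the reported sensitivity and positive predictive value.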
Algorithmic phase diagrams

Science.gov (United States)

Hockney, Roger

1987-01-01

Algorithmic phase diagrams are a neat and compact representation of the results of comparing the execution time of several algorithms for the solution of the same problem. As an example, the recent results of Gannon and Van Rosendale on the solution of multiple tridiagonal systems of equations are shown in the form of such diagrams. The act of preparing these diagrams has revealed an unexpectedly complex relationship between the best algorithm and the number and size of the tridiagonal systems, which was not evident from the algebraic formulae in the original paper. Even so, for a particular computer, one diagram suffices to predict the best algorithm for all problems that are likely to be encountered, the prediction being read directly from the diagram without complex calculation.

Monitoring of the future strong Vrancea events by using the CN formal earthquake prediction algorithm

International Nuclear Information System (INIS)

Moldoveanu, C.L.; Novikova, O.V.; Panza, G.F.; Radulian, M.

2003-06-01

The preparation process of the strong subcrustal events originating in Vrancea region, Romania, is monitored using an intermediate-term medium-range earthquake prediction method - the CN algorithm (Keilis-Borok and Rotwain, 1990). We present the results of the monitoring of the preparation of future strong earthquakes for the time interval from January 1, 1994 (1994.1.1), to January 1, 2003 (2003.1.1) using the updated catalogue of the Romanian local network. The database considered for the CN monitoring of the preparation of future strong earthquakes in Vrancea covers the period from 1966.3.1 to 2003.1.1 and the geographical rectangle 44.8 deg - 48.4 deg N, 25.0 deg - 28.0 deg E. The algorithm correctly identifies, by retrospective prediction, the TIPs for all the three strong earthquakes (Mo=6.4) that occurred in Vrancea during this period. The cumulated duration of the TIPs represents 26.5% of the total period of time considered (1966.3.1-2003.1.1). The monitoring of current seismicity using the algorithm CN has been carried out since 1994. No strong earthquakes occurred from 1994.1.1 to 2003.1.1 but the CN declared an extended false alarm from 1999.5.1 to 2000.11.1. No alarm has currently been declared in the region (on January 1, 2003), as can be seen from the TIPs diagram shown. (author)
Link Prediction Methods and Their Accuracy for Different Social Networks and Network Metrics

Directory of Open Access Journals (Sweden)

Fei Gao

2015-01-01

Full Text Available Currently, we are experiencing a rapid growth of the number of social-based online systems. The availability of the vast amounts of data gathered in those systems brings new challenges that we face when trying to analyse it. One of the intensively researched topics is the prediction of social connections between users. Although a lot of effort has been made to develop new prediction approaches, the existing methods are not comprehensively analysed. In this paper we investigate the correlation between network metrics and accuracy of different prediction methods. We selected six time-stamped real-world social networks and ten most widely used link prediction methods. The results of the experiments show that the performance of some methods has a strong correlation with certain network metrics. We managed to distinguish “prediction friendly” networks, for which most of the prediction methods give good performance, as well as “prediction unfriendly” networks, for which most of the methods result in high prediction error. Correlation analysis between network metrics and prediction accuracy of prediction methods may form the basis of a metalearning system where based on network characteristics it will be able to recommend the right prediction method for a given network.
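The abstract does not name the ten methods it compares. As a rough illustration of what a similarity-based link predictor does, the Python sketch below scores unconnected node pairs with two commonly used indices, common neighbours and the Jaccard coefficient. The choice of indices, the function names, and the toy graph are assumptions and are not taken from the paper.

```python
from itertools import combinations


def common_neighbours(adj, u, v):
    """Number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbours divided by the size of the combined neighbourhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_candidates(adj, score):
    """Rank all currently unconnected pairs by a similarity score, best first."""
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(pairs, key=lambda pair: score(adj, *pair), reverse=True)


# Toy undirected graph stored as a dict of neighbour sets.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

ranked = rank_candidates(adj, common_neighbours)
print(ranked[0], common_neighbours(adj, *ranked[0]), jaccard(adj, *ranked[0]))
```

In an evaluation like the one described, the ranking would be compared against links that appear later in the time-stamped data.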
Novel Intermode Prediction Algorithm for High Efficiency Video Coding Encoder

Directory of Open Access Journals (Sweden)

Chan-seob Park

2014-01-01

Full Text Available The joint collaborative team on video coding (JCT-VC) is developing the next-generation video coding standard, which is called high efficiency video coding (HEVC). In HEVC, there are three units in the block structure: coding unit (CU), prediction unit (PU), and transform unit (TU). The CU is the basic unit of region splitting, like the macroblock (MB). Each CU performs recursive splitting into four blocks of equal size, starting from the tree block. In this paper, we propose a fast CU depth decision algorithm for HEVC to reduce its computational complexity. In the 2N×2N PU, the proposed method compares the rate-distortion (RD) cost and determines the depth using the compared information. Moreover, in order to speed up the encoding time, an efficient merge SKIP detection method is additionally developed based on the contextual mode information of neighboring CUs. Experimental results show that the proposed algorithm achieves an average time-saving factor of 44.84% in the random access (RA) at Main profile configuration with the HEVC test model (HM 10.0) reference software. Compared to the HM 10.0 encoder, a small BD-bitrate loss of 0.17% is also observed without significant loss of image quality.
A fast, parallel algorithm to solve the basic fluvial erosion/transport equations

Science.gov (United States)

Braun, J.

2012-04-01

Quantitative models of landform evolution are commonly based on the solution of a set of equations representing the processes of fluvial erosion, transport and deposition, which leads to a prediction of the geometry of a river channel network and its evolution through time. The river network is often regarded as the backbone of any surface processes model (SPM) that might include other physical processes acting at a range of spatial and temporal scales along hill slopes. The basic laws of fluvial erosion require the computation of local (slope) and non-local (drainage area) quantities at every point of a given landscape, a computationally expensive operation which limits the resolution of most SPMs. I present here an algorithm to compute the various components required in the parameterization of fluvial erosion (and transport) and thus solve the basic fluvial geomorphic equation, that is very efficient because it is O(n) (the number of required arithmetic operations is linearly proportional to the number of nodes defining the landscape), and is fully parallelizable (the computation cost decreases in a direct inverse proportion to the number of processors used to solve the problem). The algorithm is ideally suited for use on latest multi-core processors. Using this new technique, geomorphic problems can be solved at an unprecedented resolution (typically of the order of 10,000 × 10,000 nodes) while keeping the computational cost reasonable (order 1 sec per time step). Furthermore, I will show that the algorithm is applicable to any regular or irregular representation of the landform, and is such that the temporal evolution of the landform can be discretized by a fully implicit time-marching algorithm, making it unconditionally stable. I will demonstrate that such an efficient algorithm is ideally suited to produce a fully predictive SPM that links observationally based parameterizations of small-scale processes to the evolution of large-scale features of the landscapes on
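The abstract does not spell out how the non-local quantity (drainage area) is obtained in O(n). One way to get linear cost, sketched below in Python, is to order the nodes by walking upstream from base level and then accumulate area in reverse order. This is an illustrative sketch under stated assumptions, not necessarily the author's method: each node has a single receiver, base-level nodes are their own receiver, the flow graph has no cycles, and all names are hypothetical.

```python
def drainage_area(receiver, cell_area):
    """Accumulate upstream area for every node in O(n).

    receiver[i] is the index of the node that node i drains into;
    a base-level node is marked by receiver[i] == i (assumption).
    """
    n = len(receiver)

    # Invert the receiver relation: donors[r] lists the nodes draining into r.
    donors = [[] for _ in range(n)]
    for node, rcv in enumerate(receiver):
        if rcv != node:
            donors[rcv].append(node)

    # Walk upstream from base level so each node is listed after its receiver.
    order = []
    stack = [node for node in range(n) if receiver[node] == node]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(donors[node])

    # Sweep from the most upstream nodes down, passing area to the receiver.
    area = list(cell_area)
    for node in reversed(order):
        rcv = receiver[node]
        if rcv != node:
            area[rcv] += area[node]
    return area


# Five-node toy catchment: nodes 1 and 4 drain to 0, nodes 2 and 3 drain to 1.
print(drainage_area([0, 0, 1, 1, 0], [1.0] * 5))
```

Each node is visited a constant number of times, so the cost grows linearly with the number of nodes, and no sorting by elevation is required.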
A new avenue for classification and prediction of olive cultivars using supervised and unsupervised algorithms.

Directory of Open Access Journals (Sweden)

Amir H Beiki

Full Text Available Various methods have been used to identify cultivars of olive trees; herein we used different bioinformatics algorithms to propose new tools to classify 10 cultivars of olive based on RAPD and ISSR genetic marker datasets generated from PCR reactions. Five RAPD markers (OPA0a21, OPD16a, OP01a1, OPD16a1 and OPA0a8) and five ISSR markers (UBC841a4, UBC868a7, UBC841a14, U12BC807a and UBC810a13) were selected as the most important markers by all attribute weighting models. K-Medoids unsupervised clustering run on the SVM dataset was fully able to cluster each olive cultivar to the right classes. All trees (176) induced by decision tree models were meaningful, and the UBC841a4 attribute clearly distinguished between foreign and domestic olive cultivars with 100% accuracy. Predictive machine learning algorithms (SVM and Naïve Bayes) were also able to predict the right class of olive cultivars with 100% accuracy. For the first time, our results showed that data mining techniques can be effectively used to distinguish between plant cultivars, and that the machine learning based systems proposed in this study can predict new olive cultivars with the best possible accuracy.
PEDLA: predicting enhancers with a deep learning-based algorithmic framework.

Science.gov (United States)

Liu, Feng; Li, Hao; Ren, Chao; Bo, Xiaochen; Shu, Wenjie

2016-06-22

Transcriptional enhancers are non-coding segments of DNA that play a central role in the spatiotemporal regulation of gene expression programs. However, systematically and precisely predicting enhancers remains a major challenge. Although existing methods have achieved some success in enhancer prediction, they still suffer from many issues. We developed a deep learning-based algorithmic framework named PEDLA (https://github.com/wenjiegroup/PEDLA), which can directly learn an enhancer predictor from massively heterogeneous data and generalize in ways that are mostly consistent across various cell types/tissues. We first trained PEDLA with 1,114-dimensional heterogeneous features in H1 cells, and demonstrated that the PEDLA framework integrates diverse heterogeneous features and gives state-of-the-art performance relative to five existing methods for enhancer prediction. We further extended PEDLA to iteratively learn from 22 training cell types/tissues. Our results showed that PEDLA manifested superior performance consistency in both training and independent test sets. On average, PEDLA achieved 95.0% accuracy and a 96.8% geometric mean (GM) of sensitivity and specificity across 22 training cell types/tissues, as well as 95.7% accuracy and a 96.8% GM across 20 independent test cell types/tissues. Together, our work illustrates the power of harnessing state-of-the-art deep learning techniques to consistently identify regulatory elements at a genome-wide scale from massively heterogeneous data across diverse cell types/tissues.
The novel EuroSCORE II algorithm predicts the hospital mortality of thoracic aortic surgery in 461 consecutive Japanese patients better than both the original additive and logistic EuroSCORE algorithms.

Science.gov (United States)

Nishida, Takahiro; Sonoda, Hiromichi; Oishi, Yasuhisa; Tanoue, Yoshihisa; Nakashima, Atsuhiro; Shiokawa, Yuichi; Tominaga, Ryuji

2014-04-01

The European System for Cardiac Operative Risk Evaluation (EuroSCORE) II was developed to improve the overestimation of surgical risk associated with the original (additive and logistic) EuroSCOREs. The purpose of this study was to evaluate the significance of the EuroSCORE II by comparing its performance with that of the original EuroSCOREs in Japanese patients undergoing surgery on the thoracic aorta. We have calculated the predicted mortalities according to the additive EuroSCORE, logistic EuroSCORE and EuroSCORE II algorithms in 461 patients who underwent surgery on the thoracic aorta during a period of 20 years (1993-2013). The actual in-hospital mortality rates in the low- (additive EuroSCORE of 3-6), moderate- (7-11) and high-risk (≥11) groups (followed by overall mortality) were 1.3, 6.2 and 14.4% (7.2% overall), respectively. Among the three different risk groups, the expected mortality rates were 5.5 ± 0.6, 9.1 ± 0.7 and 13.5 ± 0.2% (9.5 ± 0.1% overall) by the additive EuroSCORE algorithm, 5.3 ± 0.1, 16 ± 0.4 and 42.4 ± 1.3% (19.9 ± 0.7% overall) by the logistic EuroSCORE algorithm and 1.6 ± 0.1, 5.2 ± 0.2 and 18.5 ± 1.3% (7.4 ± 0.4% overall) by the EuroSCORE II algorithm, indicating poor prediction (P algorithms were 0.6937, 0.7169 and 0.7697, respectively. Thus, the mortality expected by the EuroSCORE II more closely matched the actual mortality in all three risk groups. In contrast, the mortality expected by the logistic EuroSCORE overestimated the risks in the moderate- (P = 0.0002) and high-risk (P < 0.0001) patient groups. Although all of the original EuroSCOREs and EuroSCORE II appreciably predicted the surgical mortality for thoracic aortic surgery in Japanese patients, the EuroSCORE II best predicted the mortalities in all risk groups.
Improved algorithms and methods for room sound-field prediction by acoustical radiosity in arbitrary polyhedral rooms

Science.gov (United States)

Nosal, Eva-Marie; Hodgson, Murray; Ashdown, Ian

2004-08-01

This paper explores acoustical (or time-dependent) radiosity, a geometrical-acoustics sound-field prediction method that assumes diffuse surface reflection. The literature of acoustical radiosity is briefly reviewed and the advantages and disadvantages of the method are discussed. A discrete form of the integral equation that results from meshing the enclosure boundaries into patches is presented and used in a discrete-time algorithm. Furthermore, an averaging technique is used to reduce computational requirements. To generalize to nonrectangular rooms, a spherical-triangle method is proposed as a means of evaluating the integrals over solid angles that appear in the discrete form of the integral equation. The evaluation of form factors, which also appear in the numerical solution, is discussed for rectangular and nonrectangular rooms. This algorithm and associated methods are validated by comparison of the steady-state predictions for a spherical enclosure to analytical solutions.
Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study.

Science.gov (United States)

Olivera, André Rodrigues; Roesler, Valter; Iochpe, Cirano; Schmidt, Maria Inês; Vigo, Álvaro; Barreto, Sandhi Maria; Duncan, Bruce Bartholow

2017-01-01

Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task. Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil. After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest. The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step and 74.17% and 74.41% in the generalization testing step. Most of the predictive models produced similar results, and demonstrated the feasibility of identifying individuals with highest probability of having undiagnosed diabetes, through easily-obtained clinical data.
A Localization Method for Underwater Wireless Sensor Networks Based on Mobility Prediction and Particle Swarm Optimization Algorithms

Directory of Open Access Journals (Sweden)

Ying Zhang

2016-02-01

Due to their special environment, Underwater Wireless Sensor Networks (UWSNs) are usually deployed over a large sea area and the nodes are usually floating. This results in a lower beacon node distribution density, a longer time for localization, and more energy consumption. Currently most of the localization algorithms in this field do not pay enough consideration on the mobility of the nodes. In this paper, by analyzing the mobility patterns of water near the seashore, a localization method for UWSNs based on a Mobility Prediction and a Particle Swarm Optimization algorithm (MP-PSO) is proposed. In this method, the range-based PSO algorithm is used to locate the beacon nodes, and their velocities can be calculated. The velocity of an unknown node is calculated by using the spatial correlation of underwater object’s mobility, and then their locations can be predicted. The range-based PSO algorithm may cause considerable energy consumption and its computation complexity is a little bit high, nevertheless the number of beacon nodes is relatively smaller, so the calculation for the large number of unknown nodes is succinct, and this method can obviously decrease the energy consumption and time cost of localizing these mobile nodes. The simulation results indicate that this method has higher localization accuracy and better localization coverage rate compared with some other widely used localization methods in this field.
A Localization Method for Underwater Wireless Sensor Networks Based on Mobility Prediction and Particle Swarm Optimization Algorithms.

Science.gov (United States)

Zhang, Ying; Liang, Jixing; Jiang, Shengming; Chen, Wei

2016-02-06

Due to their special environment, Underwater Wireless Sensor Networks (UWSNs) are usually deployed over a large sea area and the nodes are usually floating. This results in a lower beacon node distribution density, a longer time for localization, and more energy consumption. Currently most of the localization algorithms in this field do not pay enough consideration on the mobility of the nodes. In this paper, by analyzing the mobility patterns of water near the seashore, a localization method for UWSNs based on a Mobility Prediction and a Particle Swarm Optimization algorithm (MP-PSO) is proposed. In this method, the range-based PSO algorithm is used to locate the beacon nodes, and their velocities can be calculated. The velocity of an unknown node is calculated by using the spatial correlation of underwater object's mobility, and then their locations can be predicted. The range-based PSO algorithm may cause considerable energy consumption and its computation complexity is a little bit high, nevertheless the number of beacon nodes is relatively smaller, so the calculation for the large number of unknown nodes is succinct, and this method can obviously decrease the energy consumption and time cost of localizing these mobile nodes. The simulation results indicate that this method has higher localization accuracy and better localization coverage rate compared with some other widely used localization methods in this field.
Optimization the Initial Weights of Artificial Neural Networks via Genetic Algorithm Applied to Hip Bone Fracture Prediction

Directory of Open Access Journals (Sweden)

Yu-Tzu Chang

2012-01-01

This paper aims to find the optimal set of initial weights to enhance the accuracy of artificial neural networks (ANNs) by using genetic algorithms (GA). The sample in this study included 228 patients with first low-trauma hip fracture and 215 patients without hip fracture, both of them were interviewed with 78 questions. We used logistic regression to select 5 important factors (i.e., bone mineral density, experience of fracture, average hand grip strength, intake of coffee, and peak expiratory flow rate) for building artificial neural networks to predict the probabilities of hip fractures. Three-layer (one hidden layer) ANN models with back-propagation training algorithms were adopted. The purpose in this paper is to find the optimal initial weights of neural networks via genetic algorithm to improve the predictability. Area under the ROC curve (AUC) was used to assess the performance of neural networks. The study results showed the genetic algorithm obtained an AUC of 0.858 ± 0.00493 on modeling data and 0.802 ± 0.03318 on testing data. They were slightly better than the results of our previous study (0.868 ± 0.00387 and 0.796 ± 0.02559, resp.). Thus, the preliminary study for only using simple GA has been proved to be effective for improving the accuracy of artificial neural networks.
Use of Artificial Intelligence and Machine Learning Algorithms with Gene Expression Profiling to Predict Recurrent Nonmuscle Invasive Urothelial Carcinoma of the Bladder.

Science.gov (United States)

Bartsch, Georg; Mitra, Anirban P; Mitra, Sheetal A; Almal, Arpit A; Steven, Kenneth E; Skinner, Donald G; Fry, David W; Lenehan, Peter F; Worzel, William P; Cote, Richard J

2016-02-01

Due to the high recurrence risk of nonmuscle invasive urothelial carcinoma it is crucial to distinguish patients at high risk from those with indolent disease. In this study we used a machine learning algorithm to identify the genes in patients with nonmuscle invasive urothelial carcinoma at initial presentation that were most predictive of recurrence. We used the genes in a molecular signature to predict recurrence risk within 5 years after transurethral resection of bladder tumor. Whole genome profiling was performed on 112 frozen nonmuscle invasive urothelial carcinoma specimens obtained at first presentation on Human WG-6 BeadChips (Illumina®). A genetic programming algorithm was applied to evolve classifier mathematical models for outcome prediction. Cross-validation based resampling and gene use frequencies were used to identify the most prognostic genes, which were combined into rules used in a voting algorithm to predict the sample target class. Key genes were validated by quantitative polymerase chain reaction. The classifier set included 21 genes that predicted recurrence. Quantitative polymerase chain reaction was done for these genes in a subset of 100 patients. A 5-gene combined rule incorporating a voting algorithm yielded 77% sensitivity and 85% specificity to predict recurrence in the training set, and 69% and 62%, respectively, in the test set. A singular 3-gene rule was constructed that predicted recurrence with 80% sensitivity and 90% specificity in the training set, and 71% and 67%, respectively, in the test set. Using primary nonmuscle invasive urothelial carcinoma from initial occurrences genetic programming identified transcripts in reproducible fashion, which were predictive of recurrence. These findings could potentially impact nonmuscle invasive urothelial carcinoma management. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
A Clinical Prediction Algorithm to Stratify Pediatric Musculoskeletal Infection by Severity

Science.gov (United States)

Benvenuti, Michael A; An, Thomas J; Mignemi, Megan E; Martus, Jeffrey E; Mencio, Gregory A; Lovejoy, Stephen A; Thomsen, Isaac P; Schoenecker, Jonathan G; Williams, Derek J

2016-01-01

Objective: There are currently no algorithms for early stratification of pediatric musculoskeletal infection (MSKI) severity that are applicable to all types of tissue involvement. In this study, the authors sought to develop a clinical prediction algorithm that accurately stratifies infection severity based on clinical and laboratory data at presentation to the emergency department. Methods: An IRB-approved retrospective review was conducted to identify patients aged 0–18 who presented to the pediatric emergency department at a tertiary care children’s hospital with concern for acute MSKI over a five-year period (2008–2013). Qualifying records were reviewed to obtain clinical and laboratory data and to classify in-hospital outcomes using a three-tiered severity stratification system. Ordinal regression was used to estimate risk for each outcome. Candidate predictors included age, temperature, respiratory rate, heart rate, C-reactive protein, and peripheral white blood cell count. We fit fully specified (all predictors) and reduced models (retaining predictors with a p-value ≤ 0.2). Discriminatory power of the models was assessed using the concordance (c)-index. Results: Of the 273 identified children, 191 (70%) met inclusion criteria. Median age was 5.8 years. Outcomes included 47 (25%) children with inflammation only, 41 (21%) with local infection, and 103 (54%) with disseminated infection. Both the full and reduced models accurately demonstrated excellent performance (full model c-index 0.83, 95% CI [0.79–0.88]; reduced model 0.83, 95% CI [0.78–0.87]). Model fit was also similar, indicating preference for the reduced model. Variables in this model included C-reactive protein, pulse, temperature, and an interaction term for pulse and temperature. The odds of a more severe outcome increased by 30% for every 10-unit increase in C-reactive protein. Conclusions: Clinical and laboratory data obtained in the emergency department may be used to accurately
Predicting Post-Translational Modifications from Local Sequence Fragments Using Machine Learning Algorithms: Overview and Best Practices.

Science.gov (United States)

Tatjewski, Marcin; Kierczak, Marcin; Plewczynski, Dariusz

2017-01-01

Here, we present two perspectives on the task of predicting post translational modifications (PTMs) from local sequence fragments using machine learning algorithms. The first is the description of the fundamental steps required to construct a PTM predictor from the very beginning. These steps include data gathering, feature extraction, or machine-learning classifier selection. The second part of our work contains the detailed discussion of more advanced problems which are encountered in PTM prediction task. Probably the most challenging issues which we have covered here are: (1) how to address the training data class imbalance problem (we also present statistics describing the problem); (2) how to properly set up cross-validation folds with an approach which takes into account the homology of protein data records, to address this problem we present our folds-over-clusters algorithm; and (3) how to efficiently reach for new sources of learning features. Presented techniques and notes resulted from intense studies in the field, performed by our and other groups, and can be useful both for researchers beginning in the field of PTM prediction and for those who want to extend the repertoire of their research techniques.
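As a rough illustration of point (2) above, the sketch below keeps every cluster of homologous records inside a single cross-validation fold. It is not the authors' folds-over-clusters algorithm, whose details are not given in this abstract; the function name `assign_clusters_to_folds`, the `cluster_of` mapping, and the greedy size-balancing rule are all assumptions made for this example.

```python
def assign_clusters_to_folds(cluster_of, n_folds):
    """Assign whole clusters to folds so homologous records never straddle folds.

    cluster_of: dict mapping record id -> cluster id (hypothetical input,
                e.g. produced by a sequence-clustering step).
    n_folds:    number of cross-validation folds.
    Returns a list of n_folds lists of record ids.
    """
    # Group record ids by their cluster id.
    clusters = {}
    for record, cluster in cluster_of.items():
        clusters.setdefault(cluster, []).append(record)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing (an assumption, not the published method):
    # place the largest clusters first, each into the currently smallest fold.
    for members in sorted(clusters.values(), key=len, reverse=True):
        smallest = min(folds, key=len)
        smallest.extend(members)
    return folds


if __name__ == "__main__":
    toy = {"p1": "c1", "p2": "c1", "p3": "c2", "p4": "c3", "p5": "c3", "p6": "c3"}
    for i, fold in enumerate(assign_clusters_to_folds(toy, 2)):
        print(i, sorted(fold))
```

Because clusters are moved as units, a model tested on one fold has not seen close homologs of its test records during training, which is the leakage this kind of fold construction is meant to avoid.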
Prediction of breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm

Science.gov (United States)

Heidari, Morteza; Zargari Khuzani, Abolfazl; Hollingsworth, Alan B.; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qiu, Yuchen; Liu, Hong; Zheng, Bin

2018-02-01

In order to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study aims to investigate advantages of applying a machine learning approach embedded with a locality preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset involving negative mammograms acquired from 500 women was assembled. This dataset was divided into two age-matched classes of 250 high risk cases in which cancer was detected in the next subsequent mammography screening and 250 low risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment fibro-glandular tissue depicted on mammograms and initially compute 44 features related to the bilateral asymmetry of mammographic tissue density distribution between left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection in the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was applied to train and test the machine learning classifier embedded with an LPP algorithm, which generated a new operational vector with 4 features using a maximal variance approach in each LOCO process. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increased trend of adjusted odds ratios was also detected in which odds ratios increased from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality, and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.
Predicting Sepsis Risk Using the "Sniffer" Algorithm in the Electronic Medical Record.

Science.gov (United States)

Olenick, Evelyn M; Zimbro, Kathie S; DʼLima, Gabrielle M; Ver Schneider, Patricia; Jones, Danielle

The Sepsis "Sniffer" Algorithm (SSA) has merit as a digital sepsis alert but should be considered an adjunct to versus an alternative for the Nurse Screening Tool (NST), given lower specificity and positive predictive value. The SSA reduced the risk of incorrectly categorizing patients at low risk for sepsis, detected sepsis high risk in half the time, and reduced redundant NST screens by 70% and manual screening hours by 64% to 72%. Preserving nurse hours expended on manual sepsis alerts may translate into time directed toward other patient priorities.
Earthquake Prediction Analysis Based on Empirical Seismic Rate: The M8 Algorithm

International Nuclear Information System (INIS)

Molchan, G.; Romashkova, L.

2010-07-01

The quality of space-time earthquake prediction is usually characterized by a two-dimensional error diagram (n,τ), where n is the rate of failures-to-predict and τ is the normalized measure of space-time alarm. The most reasonable space measure for analysis of a prediction strategy is the rate of target events λ(dg) in a sub-area dg. In that case the quantity H = 1-(n +τ) determines the prediction capability of the strategy. The uncertainty of λ(dg) causes difficulties in estimating H and the statistical significance, α, of prediction results. We investigate this problem theoretically and show how the uncertainty of the measure can be taken into account in two situations, viz., the estimation of α and the construction of a confidence zone for the (n,τ)-parameters of the random strategies. We use our approach to analyse the results from prediction of M ≥ 8.0 events by the M8 method for the period 1985-2009 (the M8.0+ test). The model of λ(dg) based on the events Mw ≥ 5.5, 1977-2004, and the magnitude range of target events 8.0 ≤ M < 8.5 are considered as basic to this M8 analysis. We find the point and upper estimates of α and show that they are still unstable because the number of target events in the experiment is small. However, our results argue in favour of non-triviality of the M8 prediction algorithm. (author)
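The error-diagram quantities defined above can be made concrete with a small sketch. The function names and the toy counts below are assumptions for illustration only; the abstract does not describe how the authors compute these values or how they handle the uncertainty of λ(dg).

```python
def failure_rate(num_targets, num_predicted):
    """n: fraction of target events that were not predicted."""
    if num_targets <= 0 or not (0 <= num_predicted <= num_targets):
        raise ValueError("need 0 <= num_predicted <= num_targets and num_targets > 0")
    return (num_targets - num_predicted) / num_targets


def prediction_capability(n, tau):
    """H = 1 - (n + tau), with n the rate of failures-to-predict and
    tau the normalized measure of the space-time alarm (both in [0, 1])."""
    if not (0.0 <= n <= 1.0 and 0.0 <= tau <= 1.0):
        raise ValueError("n and tau must lie in [0, 1]")
    return 1.0 - (n + tau)


if __name__ == "__main__":
    # Toy numbers, not taken from the M8 experiment.
    n = failure_rate(num_targets=4, num_predicted=3)
    print(prediction_capability(n, tau=0.25))
```

With these definitions a random-guess strategy sits on the diagonal n + τ = 1, so H is 0 there, and a larger H means the strategy lies further from that diagonal.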
LinkMind: Link Optimization in Swarming Mobile Sensor Networks

DEFF Research Database (Denmark)

Ngo, Trung Dung

2012-01-01

of the most advantageous properties of the swarming wireless sensor network is that mobile nodes can work cooperatively to organize an ad-hoc network and optimize the network link capacity to maximize the transmission of gathered data from a source to a target. This paper describes a new method of link...... optimization of swarming mobile sensor networks. The new method is based on combination of the artificial potential force guaranteeing connectivities of the mobile sensor nodes and the max-flow min-cut theorem of graph theory ensuring optimization of the network link capacity. The developed algorithm...
HUMAN DECISIONS AND MACHINE PREDICTIONS.

Science.gov (United States)

Kleinberg, Jon; Lakkaraju, Himabindu; Leskovec, Jure; Ludwig, Jens; Mullainathan, Sendhil

2018-02-01

Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; and these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals. JEL Codes: C10 (Econometric and statistical methods and methodology), C55 (Large datasets: Modeling and analysis), K40 (Legal procedure, the legal system, and illegal behavior).

Predictive Control of Hydronic Floor Heating Systems using Neural Networks and Genetic Algorithms

DEFF Research Database (Denmark)

Vinther, Kasper; Green, Torben; Østergaard, Søren

2017-01-01

This paper presents the use of a neural network and a micro genetic algorithm to optimize future set-points in existing hydronic floor heating systems for improved energy efficiency. The neural network can be trained to predict the impact of changes in set-points on future room temperatures. Additio...... space is not guaranteed. Evaluation of the performance of multiple neural networks is performed, using different levels of information, and optimization results are presented on a detailed house simulation model....
Secondary Structure Prediction of Protein using Resilient Back Propagation Learning Algorithm

Directory of Open Access Journals (Sweden)

Jyotshna Dongardive

2015-12-01

The paper proposes a neural network based approach to predict secondary structure of protein. It uses a Multilayer Feed Forward Network (MLFN) with resilient back propagation as the learning algorithm. Point Accepted Mutation (PAM) is adopted as the encoding scheme and the CB396 data set is used for the training and testing of the network. Overall accuracy of the network has been experimentally calculated with different window sizes for the sliding window scheme and by varying the number of units in the hidden layer. The best results were obtained with eleven as the window size and seven as the number of units in the hidden layer.
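The sliding window scheme mentioned above can be sketched as follows: each residue is represented by the residues around it, so that a window of eleven covers five neighbours on each side. The function name `sliding_windows`, the `-` padding symbol, and the choice to pad the sequence ends are assumptions for this example; the PAM-based numeric encoding used in the paper is not reproduced here.

```python
def sliding_windows(sequence, window_size=11, pad="-"):
    """Return one window per residue, centred on that residue.

    sequence:    amino-acid sequence as a string.
    window_size: odd window length (eleven gave the best results in the abstract).
    pad:         placeholder for positions beyond the sequence ends (an assumption).
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the window has a centre")
    half = window_size // 2
    padded = pad * half + sequence + pad * half
    # Window i spans padded[i : i + window_size]; its centre is sequence[i].
    return [padded[i:i + window_size] for i in range(len(sequence))]


if __name__ == "__main__":
    for window in sliding_windows("MKVLAT", window_size=5):
        print(window)
```

Each window would then be converted to a numeric vector by the chosen encoding and fed to the network, whose output is the predicted structure class of the central residue.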
PREDICTIVE CONTROL OF A BATCH POLYMERIZATION SYSTEM USING A FEEDFORWARD NEURAL NETWORK WITH ONLINE ADAPTATION BY GENETIC ALGORITHM

OpenAIRE

Cancelier, A.; Claumann, C. A.; Bolzan, A.; Machado, R. A. F.

2016-01-01

This study used a predictive controller based on an empirical nonlinear model comprising a three-layer feedforward neural network for temperature control of the suspension polymerization process. In addition to the offline training technique, an algorithm was also analyzed for online adaptation of its parameters. For the offline training, the network was statically trained and the genetic algorithm technique was used in combination with the least squares method. For online training, ...
Predicting the Occurrence of Haze Events in Southeast Asia using Machine Learning Algorithms

Science.gov (United States)

Lee, H. H.; Chulakadabba, A.; Tonks, A.; Yang, Z.; Wang, C.

2017-12-01

Severe local- and regional-scale air pollution episodes typically originate from 1) high emissions of air pollutants, 2) poor dispersion conditions, and 3) trans-boundary pollutant transport. Biomass burning activities have become more frequent in Southeast Asia, especially in Sumatra, Borneo, and mainland Southeast Asia. Trans-boundary transport of biomass burning aerosols often leads to air quality problems in the region. Furthermore, particulate pollutants from human activities besides biomass burning also play an important role in the air quality of Southeast Asia. Singapore, for example, has a dynamic industrial sector including chemical, electric and metallurgic industries, and is the region's major petroleum-refining center. In addition, natural gas and oil power plants, waste incinerators, active port traffic, and a major regional airport further complicate Singapore's air quality issues. In this study, we compare five machine learning algorithms (k-Nearest Neighbors, Linear Support Vector Machine, Decision Tree, Random Forest and Artificial Neural Network) to identify haze patterns and determine variable importance. The algorithms were trained using local atmospheric data (i.e. months, atmospheric conditions, wind direction and relative humidity) from three observation stations in Singapore (Changi, Seletar and Paya Lebar). We find that the algorithms reveal the associations in data within and between the stations, and provide in-depth interpretation of the haze sources. The algorithms also allow us to predict the probability of haze episodes in Singapore and to determine the correlation between this probability and atmospheric conditions.
PREDICTION OF WATER QUALITY INDEX USING BACK PROPAGATION NETWORK ALGORITHM. CASE STUDY: GOMBAK RIVER

Directory of Open Access Journals (Sweden)

FARIS GORASHI

2012-08-01

The aim of this study is to enable prediction of water quality parameters in conjunction with land use attributes and to find a low-end alternative to water quality monitoring techniques, which are typically expensive and tedious. It also aims to support sustainable development, which essentially affects water quality. The research approach followed in this study uses artificial neural networks and a geographical information system to provide a reliable prediction model. A back propagation network algorithm was used for the purpose of this study. The proposed approach minimized most of the anomalies associated with prediction methods and provided water quality prediction with precision. The study used 5 hidden nodes in this network. The network was optimized to complete 23145 cycles before it reached the best error of 0.65. Station 18 showed the greatest fluctuation among the three stations, as it reflects an area of ongoing rapid development in the Gombak river watershed. The results showed a very close prediction, with a best error of 0.67 in a sensitivity test that was carried out afterwards.
A conservative fully implicit algorithm for predicting slug flows

Science.gov (United States)

Krasnopolsky, Boris I.; Lukyanov, Alexander A.

2018-02-01

Accurate and predictive modelling of slug flows is required by many industries (e.g., oil and gas, nuclear engineering, chemical engineering) to prevent undesired events potentially leading to serious environmental accidents. For example, hydrodynamic and terrain-induced slugging leads to unwanted unsteady flow conditions. This demands the development of fast and robust numerical techniques for predicting slug flows. The study presented in this paper proposes a multi-fluid model and its implementation method accounting for phase appearance and disappearance. The numerical modelling of phase appearance and disappearance presents a complex numerical challenge for all multi-component and multi-fluid models. Numerical challenges arise from the singular systems of equations when some phases are absent and from the solution discontinuity when some phases appear or disappear. This paper provides a flexible and robust solution to these issues. A fully implicit formulation described in this work enables the governing fluid flow equations to be solved efficiently. The proposed numerical method provides a modelling capability for phase appearance and disappearance processes, which is based on a switching procedure between various sets of governing equations. These sets of equations are constructed using information about the number of phases present in the computational domain. The proposed scheme does not require explicit truncation of solutions, leading to a conservative scheme for mass and linear momentum. A transient two-fluid model is used to verify and validate the proposed algorithm for conditions of hydrodynamic and terrain-induced slug flow regimes. The developed modelling capabilities allow prediction of all the major features of the experimental data, and the predictions are in good quantitative agreement with them.
How Are Mate Preferences Linked with Actual Mate Selection? Tests of Mate Preference Integration Algorithms Using Computer Simulations and Actual Mating Couples.

Science.gov (United States)

Conroy-Beam, Daniel; Buss, David M

2016-01-01

Prior mate preference research has focused on the content of mate preferences. Yet in real life, people must select mates among potentials who vary along myriad dimensions. How do people incorporate information on many different mate preferences in order to choose which partner to pursue? Here, in Study 1, we compare seven candidate algorithms for integrating multiple mate preferences in a competitive agent-based model of human mate choice evolution. This model shows that a Euclidean algorithm is the most evolvable solution to the problem of selecting fitness-beneficial mates. Next, across three studies of actual couples (Study 2: n = 214; Study 3: n = 259; Study 4: n = 294) we apply the Euclidean algorithm toward predicting mate preference fulfillment overall and preference fulfillment as a function of mate value. Consistent with the hypothesis that mate preferences are integrated according to a Euclidean algorithm, we find that actual mates lie close in multidimensional preference space to the preferences of their partners. Moreover, this Euclidean preference fulfillment is greater for people who are higher in mate value, highlighting theoretically-predictable individual differences in who gets what they want. These new Euclidean tools have important implications for understanding real-world dynamics of mate selection.
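A minimal sketch of what integrating preferences "according to a Euclidean algorithm" can look like: preferences and a potential mate's traits are treated as points in the same multidimensional space, and a smaller distance means a better match. The function names, the trait scales, and the toy values are assumptions for illustration; the abstract does not specify the dimensions or any weighting the authors used.

```python
import math


def preference_distance(preferences, traits):
    """Euclidean distance between an ideal-preference vector and a
    potential mate's trait vector (same dimensions, same order)."""
    if len(preferences) != len(traits):
        raise ValueError("preference and trait vectors must have equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preferences, traits)))


def closest_mate(preferences, candidates):
    """Return the name of the candidate whose traits lie closest to the preferences.

    candidates: dict mapping a (hypothetical) candidate name -> trait vector.
    """
    return min(candidates, key=lambda name: preference_distance(preferences, candidates[name]))


if __name__ == "__main__":
    ideal = [5, 5, 3]  # toy ratings on three hypothetical dimensions
    pool = {"A": [1, 1, 3], "B": [4, 6, 3], "C": [7, 2, 5]}
    print(closest_mate(ideal, pool))
```

Under this view, preference fulfillment for an actual couple is simply how small the distance is between one partner's stated preferences and the other partner's traits.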
How Are Mate Preferences Linked with Actual Mate Selection? Tests of Mate Preference Integration Algorithms Using Computer Simulations and Actual Mating Couples.

Directory of Open Access Journals (Sweden)

Daniel Conroy-Beam

Prior mate preference research has focused on the content of mate preferences. Yet in real life, people must select mates among potentials who vary along myriad dimensions. How do people incorporate information on many different mate preferences in order to choose which partner to pursue? Here, in Study 1, we compare seven candidate algorithms for integrating multiple mate preferences in a competitive agent-based model of human mate choice evolution. This model shows that a Euclidean algorithm is the most evolvable solution to the problem of selecting fitness-beneficial mates. Next, across three studies of actual couples (Study 2: n = 214; Study 3: n = 259; Study 4: n = 294) we apply the Euclidean algorithm toward predicting mate preference fulfillment overall and preference fulfillment as a function of mate value. Consistent with the hypothesis that mate preferences are integrated according to a Euclidean algorithm, we find that actual mates lie close in multidimensional preference space to the preferences of their partners. Moreover, this Euclidean preference fulfillment is greater for people who are higher in mate value, highlighting theoretically-predictable individual differences in who gets what they want. These new Euclidean tools have important implications for understanding real-world dynamics of mate selection.
[Development and validation of an algorithm to identify cancer recurrences from hospital data bases].

Science.gov (United States)

Manzanares-Laya, S; Burón, A; Murta-Nascimento, C; Servitja, S; Castells, X; Macià, F

2014-01-01

Hospital cancer registries and hospital databases are valuable and efficient sources of information for research into cancer recurrences. The aim of this study was to develop and validate algorithms for the detection of breast cancer recurrence. A retrospective observational study was conducted on breast cancer cases from the cancer registry of a third level university hospital diagnosed between 2003 and 2009. Different probable cancer recurrence algorithms were obtained by linking the hospital databases and constructing several operational definitions, with their corresponding sensitivity, specificity, positive predictive value and negative predictive value. A total of 1,523 patients were diagnosed with breast cancer between 2003 and 2009. A request for bone gammagraphy later than 6 months after the first oncological treatment showed the highest sensitivity (53.8%) and negative predictive value (93.8%), and a pathology test later than 6 months after the diagnosis showed the highest specificity (93.8%) and negative predictive value (92.6%). The combination of different definitions increased the specificity and the positive predictive value, but decreased the sensitivity. Several diagnostic algorithms were obtained, and the different definitions could be useful depending on the interest and resources of the researcher. A higher positive predictive value could be useful for a quick estimation of the number of cases, and a higher negative predictive value for a more exact estimation if more resources are available. It is a versatile tool that can be adapted to other types of tumors, as well as to the needs of the researcher. Copyright © 2014 SECA. Published by Elsevier Espana. All rights reserved.
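The four validation measures named above follow from a two-by-two table of algorithm result against true recurrence status. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard validation measures from a 2x2 confusion table.
    tp/fn: true recurrences flagged / missed by the algorithm.
    tn/fp: non-recurrences correctly cleared / wrongly flagged."""
    return {
        "sensitivity": tp / (tp + fn),   # share of true recurrences detected
        "specificity": tn / (tn + fp),   # share of non-recurrences cleared
        "ppv": tp / (tp + fp),           # share of flagged cases that are real
        "npv": tn / (tn + fn),           # share of cleared cases that are real
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=40, fp=10, tn=90, fn=60)
```

Requiring several definitions to agree before flagging a case typically lowers fp and raises fn, which is consistent with the reported trade-off of higher specificity and positive predictive value against lower sensitivity.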
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM

International Nuclear Information System (INIS)

Bobra, M. G.; Couvidat, S.

2015-01-01

We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities
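The true skill statistic (TSS) emphasized above is the hit rate minus the false alarm rate. The sketch below uses that standard definition; the counts of flaring and non-flaring regions are hypothetical.

```python
def true_skill_statistic(tp, fp, tn, fn):
    """TSS = TP / (TP + FN) - FP / (FP + TN).
    Ranges from -1 to 1; 0 means no skill."""
    hit_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    return hit_rate - false_alarm_rate

# Hypothetical forecast outcomes.
tss_example = true_skill_statistic(tp=80, fp=10, tn=90, fn=20)

# A forecast that never predicts a flare scores zero however rare flares are.
tss_never = true_skill_statistic(tp=0, fp=0, tn=180, fn=20)
```

Because each term is normalized within its own class, the score does not depend on the ratio of flaring to non-flaring samples, which is one reason it suits rare-event forecasting.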
Adaptive laser link reconfiguration using constraint propagation

Science.gov (United States)

Crone, M. S.; Julich, P. M.; Cook, L. M.

1993-01-01

This paper describes Harris AI research performed on the Adaptive Link Reconfiguration (ALR) study for Rome Lab, and focuses on the application of constraint propagation to the problem of link reconfiguration for the proposed space based Strategic Defense System (SDS) Brilliant Pebbles (BP) communications system. According to the concept of operations at the time of the study, laser communications will exist between BP's and to ground entry points. Long-term links typical of RF transmission will not exist. This study addressed an initial implementation of BP's based on the Global Protection Against Limited Strikes (GPALS) SDI mission. The number of satellites and rings studied was representative of this problem. An orbital dynamics program was used to generate line-of-sight data for the modeled architecture. This was input into a discrete event simulation implemented in the Harris developed COnstraint Propagation Expert System (COPES) Shell, developed initially on the Rome Lab BM/C3 study. Using a model of the network and several heuristics, the COPES shell was used to develop the Heuristic Adaptive Link Ordering (HALO) Algorithm to rank and order potential laser links according to probability of communication. A reduced set of links based on this ranking would then be used by a routing algorithm to select the next hop. This paper includes an overview of Constraint Propagation as an Artificial Intelligence technique and its embodiment in the COPES shell. It describes the design and implementation of both the simulation of the GPALS BP network and the HALO algorithm in COPES. This is described using a 59 Data Flow Diagram, State Transition Diagrams, and Structured English PDL. It describes a laser communications model and the heuristics involved in rank-ordering the potential communication links. The generation of simulation data is described along with its interface via COPES to the Harris developed View Net graphical tool for visual analysis of communications.
The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents

Directory of Open Access Journals (Sweden)

Ziad Salem

2014-12-01

Full Text Available Learning is the act of obtaining new or modifying existing knowledge, behaviours, skills or preferences. The ability to learn is found in humans, other organisms and some machines. Learning is always based on some sort of observations or data such as examples, direct experience or instruction. This paper presents a classification algorithm to learn the density of agents in an arena based on the measurements of six proximity sensors of combined actuator sensor units (CASUs). Rules are presented that were induced by the learning algorithm that was trained with data-sets based on the CASU’s sensor data streams collected during a number of experiments with “Bristlebots” (agents) in the arena (environment). It was found that a set of rules generated by the learning algorithm is able to predict the number of bristlebots in the arena based on the CASU’s sensor readings with satisfying accuracy.
Development of Predictive QSAR Models of 4-Thiazolidinones Antitrypanosomal Activity using Modern Machine Learning Algorithms.

Science.gov (United States)

Kryshchyshyn, Anna; Devinyak, Oleg; Kaminskyy, Danylo; Grellier, Philippe; Lesyk, Roman

2017-11-14

This paper presents novel QSAR models for the prediction of antitrypanosomal activity among thiazolidines and related heterocycles. The performance of four machine learning algorithms (Random Forest regression, Stochastic gradient boosting, Multivariate adaptive regression splines and Gaussian processes regression) has been studied in order to reach better levels of predictivity. The results for Random Forest and Gaussian processes regression are comparable and outperform other studied methods. The preliminary descriptor selection with the Boruta method improved the outcome of the machine learning methods. The two novel QSAR models developed with Random Forest and Gaussian processes regression algorithms have good predictive ability, which was proved by the external evaluation of the test set with corresponding Q²ext = 0.812 and Q²ext = 0.830. The obtained models can be used further for in silico screening of virtual libraries in the same chemical domain in order to find new antitrypanosomal agents. Thorough analysis of descriptor influence in the QSAR models and interpretation of their chemical meaning highlights a number of structure-activity relationships. The presence of phenyl rings with electron-withdrawing atoms or groups in para-position, increased number of aromatic rings, high branching but short chains, high HOMO energy, and the introduction of 1-substituted 2-indolyl fragment into the molecular structure have been recognized as trypanocidal activity prerequisites. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Verification and improvement of predictive algorithms for radionuclide migration

International Nuclear Information System (INIS)

Carnahan, C.L.; Miller, C.W.; Remer, J.S.

1984-01-01

This research addresses issues relevant to numerical simulation and prediction of migration of radionuclides in the environment of nuclear waste repositories. Specific issues investigated are the adequacy of current numerical codes in simulating geochemical interactions affecting radionuclide migration, the level of complexity required in chemical algorithms of transport models, and the validity of the constant-k_D concept in chemical transport modeling. An initial survey of the literature led to the conclusion that existing numerical codes did not encompass the full range of chemical and physical phenomena influential in radionuclide migration. Studies of chemical algorithms have been conducted within the framework of a one-dimensional numerical code that simulates the transport of chemically reacting solutes in a saturated porous medium. The code treats transport by dispersion/diffusion and advection, and equilibrium-controlled processes of interphase mass transfer, complexation in the aqueous phase, pH variation, and precipitation/dissolution of secondary solids. Irreversible, time-dependent dissolution of solid phases during transport can be treated. Mass action, transport, and sorptive site constraint equations are expressed in differential/algebraic form and are solved simultaneously. Simulations using the code show that use of the constant-k_D concept can produce unreliable results in geochemical transport modeling. Applications to a field test and laboratory analogs of a nuclear waste repository indicate that a thermodynamically based simulator of chemical transport can successfully mimic real processes provided that operative chemical mechanisms and associated data have been correctly identified and measured, and have been incorporated in the simulator. 17 references, 10 figures.
Reliable prediction of adsorption isotherms via genetic algorithm molecular simulation.

Science.gov (United States)

LoftiKatooli, L; Shahsavand, A

2017-01-01

Conventional molecular simulation techniques such as grand canonical Monte Carlo (GCMC) strictly rely on purely random search inside the simulation box for predicting the adsorption isotherms. This blind search is usually extremely time demanding for providing a faithful approximation of the real isotherm and in some cases may lead to non-optimal solutions. A novel approach is presented in this article which does not use any of the classical steps of the standard GCMC method, such as displacement, insertion, and removal. The new approach is based on the well-known genetic algorithm to find the optimal configuration for adsorption of any adsorbate on a structured adsorbent under prevailing pressure and temperature. The proposed approach considers the molecular simulation problem as a global optimization challenge. A detailed flow chart of our so-called genetic algorithm molecular simulation (GAMS) method is presented, which is entirely different from traditional molecular simulation approaches. Three real case studies (for adsorption of CO2 and H2 over various zeolites) are borrowed from literature to clearly illustrate the superior performances of the proposed method over the standard GCMC technique. For the present method, the average absolute values of percentage errors are around 11% (RHO-H2), 5% (CHA-CO2), and 16% (BEA-CO2), while they were about 70%, 15%, and 40% for the standard GCMC technique, respectively.
Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations.

Science.gov (United States)

Zong, Nansu; Kim, Hyeoneui; Ngo, Victoria; Harismendy, Olivier

2017-08-01

A heterogeneous network topology possessing abundant interactions between biomedical entities has yet to be utilized in similarity-based methods for predicting drug-target associations based on the array of varying features of drugs and their targets. Deep learning reveals features of vertices of a large network that can be adapted in accommodating the similarity-based solutions to provide a flexible method of drug-target prediction. We propose a similarity-based drug-target prediction method that enhances existing association discovery methods by using a topology-based similarity measure. DeepWalk, a deep learning method, is adopted in this study to calculate the similarities within Linked Tripartite Network (LTN), a heterogeneous network generated from biomedical linked datasets. This proposed method shows promising results for drug-target association prediction: 98.96% AUC ROC score with a 10-fold cross-validation and 99.25% AUC ROC score with a Monte Carlo cross-validation with LTN. By utilizing DeepWalk, we demonstrate that: (i) this method outperforms other existing topology-based similarity computation methods, (ii) the performance is better for tripartite than with bipartite networks and (iii) the measure of similarity using network topology outperforms the ones derived from chemical structure (drugs) or genomic sequence (targets). Our proposed methodology proves to be capable of providing a promising solution for drug-target prediction based on topological similarity with a heterogeneous network, and may be readily re-purposed and adapted in the existing of similarity-based methodologies. The proposed method has been developed in JAVA and it is available, along with the data at the following URL: https://github.com/zongnansu1982/drug-target-prediction . nazong@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. 
Protein docking prediction using predicted protein-protein interface

Directory of Open Access Journals (Sweden)

Li Bin

2012-01-01

Full Text Available Abstract Background Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Protein docking prediction using predicted protein-protein interface.

Science.gov (United States)

Li, Bin; Kihara, Daisuke

2012-01-10

Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm.

Science.gov (United States)

Bai, Li-Yue; Dai, Hao; Xu, Qin; Junaid, Muhammad; Peng, Shao-Liang; Zhu, Xiaolei; Xiong, Yi; Wei, Dong-Qing

2018-02-05

Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.
Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm

Directory of Open Access Journals (Sweden)

Li-Yue Bai

2018-02-01

Full Text Available Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.

Predicting the severity of nuclear power plant transients using nearest neighbors modeling optimized by genetic algorithms on a parallel computer

International Nuclear Information System (INIS)

Lin, J.; Bartal, Y.; Uhrig, R.E.

1995-01-01

The importance of automatic diagnostic systems for nuclear power plants (NPPs) has been discussed in numerous studies, and various such systems have been proposed. None of those systems were designed to predict the severity of the diagnosed scenario. A classification and severity prediction system for NPP transients is developed. The system is based on nearest neighbors modeling, which is optimized using genetic algorithms. The optimization process is used to determine the most important variables for each of the transient types analyzed. An enhanced version of the genetic algorithms is used in which a local downhill search is performed to further increase the accuracy achieved. The genetic algorithms search was implemented on a massively parallel supercomputer, the KSR1-64, to perform the analysis in a reasonable time. The data for this study were supplied by the high-fidelity simulator of the San Onofre unit 1 pressurized water reactor
Evaluation of machine learning algorithms for prediction of regions of high Reynolds averaged Navier Stokes uncertainty

Science.gov (United States)

Ling, J.; Templeton, J.

2015-08-01

Reynolds Averaged Navier Stokes (RANS) models are widely used in industry to predict fluid flows, despite their acknowledged deficiencies. Not only do RANS models often produce inaccurate flow predictions, but there are very limited diagnostics available to assess RANS accuracy for a given flow configuration. If experimental or higher fidelity simulation results are not available for RANS validation, there is no reliable method to evaluate RANS accuracy. This paper explores the potential of utilizing machine learning algorithms to identify regions of high RANS uncertainty. Three different machine learning algorithms were evaluated: support vector machines, Adaboost decision trees, and random forests. The algorithms were trained on a database of canonical flow configurations for which validated direct numerical simulation or large eddy simulation results were available, and were used to classify RANS results on a point-by-point basis as having either high or low uncertainty, based on the breakdown of specific RANS modeling assumptions. Classifiers were developed for three different basic RANS eddy viscosity model assumptions: the isotropy of the eddy viscosity, the linearity of the Boussinesq hypothesis, and the non-negativity of the eddy viscosity. It is shown that these classifiers are able to generalize to flows substantially different from those on which they were trained. Feature selection techniques, model evaluation, and extrapolation detection are discussed in the context of turbulence modeling applications.
Application of genetic algorithm - multiple linear regressions to predict the activity of RSK inhibitors

Directory of Open Access Journals (Sweden)

Avval Zhila Mohajeri

2015-01-01

Full Text Available This paper deals with developing a linear quantitative structure-activity relationship (QSAR) model for predicting the RSK inhibition activity of some new compounds. A dataset consisting of 62 pyrazino [1,2-α] indole, diazepino [1,2-α] indole, and imidazole derivatives with known inhibitory activities was used. The multiple linear regressions (MLR) technique combined with the stepwise (SW) and the genetic algorithm (GA) methods as variable selection tools was employed. To further check the stability, robustness and predictability of the proposed models, internal and external validation techniques were used. Comparison of the results obtained indicates that the GA-MLR model is superior to the SW-MLR model and that it is applicable for designing novel RSK inhibitors.
A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

Directory of Open Access Journals (Sweden)

Daqing Zhang

2015-01-01

Full Text Available Blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, optimizing both SVM parameters and feature subset simultaneously with a genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play an important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration.
Prostate cancer prediction using the random forest algorithm that takes into account transrectal ultrasound findings, age, and serum levels of prostate-specific antigen.

Science.gov (United States)

Xiao, Li-Hong; Chen, Pei-Ran; Gou, Zhong-Ping; Li, Yong-Zhong; Li, Mei; Xiang, Liang-Cheng; Feng, Ping

2017-01-01

The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than either diagnostic indicator on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
Observational study to calculate addictive risk to opioids: a validation study of a predictive algorithm to evaluate opioid use disorder

Directory of Open Access Journals (Sweden)

Brenton A

2017-05-01

Full Text Available Ashley Brenton,1 Steven Richeimer,2,3 Maneesh Sharma,4 Chee Lee,1 Svetlana Kantorovich,1 John Blanchard,1 Brian Meshkin1 1Proove Biosciences, Irvine, CA, 2Keck school of Medicine, University of Southern California, Los Angeles, CA, 3Departments of Anesthesiology and Psychiatry, University of Southern California, Los Angeles, CA, 4Interventional Pain Institute, Baltimore, MD, USA Background: Opioid abuse in chronic pain patients is a major public health issue, with rapidly increasing addiction rates and deaths from unintentional overdose more than quadrupling since 1999. Purpose: This study seeks to determine the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm incorporating phenotypic risk factors and neuroscience-associated single-nucleotide polymorphisms (SNPs). Patients and methods: The Proove Opioid Risk (POR) algorithm determines the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm incorporating phenotypic risk factors and neuroscience-associated SNPs. In a validation study with 258 subjects with diagnosed opioid use disorder (OUD) and 650 controls who reported using opioids, the POR successfully categorized patients at high and moderate risks of opioid misuse or abuse with 95.7% sensitivity. Regardless of changes in the prevalence of opioid misuse or abuse, the sensitivity of POR remained >95%. Conclusion: The POR correctly stratifies patients into low-, moderate-, and high-risk categories to appropriately identify patients at need for additional guidance, monitoring, or treatment changes. Keywords: opioid use disorder, addiction, personalized medicine, pharmacogenetics, genetic testing, predictive algorithm
Predicting the distribution of bed material accumulation using river network sediment budgets

Science.gov (United States)

Wilkinson, Scott N.; Prosser, Ian P.; Hughes, Andrew O.

2006-10-01

Assessing the spatial distribution of bed material accumulation in river networks is important for determining the impacts of erosion on downstream channel form and habitat and for planning erosion and sediment management. A model that constructs spatially distributed budgets of bed material sediment is developed to predict the locations of accumulation following land use change. For each link in the river network, GIS algorithms are used to predict bed material supply from gullies, river banks, and upstream tributaries and to compare total supply with transport capacity. The model is tested in the 29,000 km2 Murrumbidgee River catchment in southeast Australia. It correctly predicts the presence or absence of accumulation in 71% of river links, which is significantly better performance than previous models, which do not account for spatial variability in sediment supply and transport capacity. Representing transient sediment storage is important for predicting smaller accumulations. Bed material accumulation is predicted in 25% of the river network, indicating its importance as an environmental problem in Australia.
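The per-link comparison of total supply with transport capacity can be sketched as follows. This is a simplified illustration, not the published model: it assumes that any supply above a link's capacity accumulates in that link and the remainder passes downstream, and the data layout, function name, and numbers are all hypothetical.

```python
def route_bed_material(links, order):
    """Route bed material through a river network, link by link.

    links: dict mapping link id -> {"local_supply", "capacity", "downstream"}
           (downstream is another link id, or None at the outlet).
    order: link ids listed from upstream to downstream.
    Returns the predicted accumulation in each link.
    """
    inflow = {link_id: 0.0 for link_id in links}
    accumulation = {}
    for link_id in order:
        link = links[link_id]
        # Total supply: local sources (e.g. gullies, banks) plus upstream yield.
        supply = link["local_supply"] + inflow[link_id]
        # Assumption: excess over transport capacity is deposited in the link.
        accumulation[link_id] = max(0.0, supply - link["capacity"])
        outflow = min(supply, link["capacity"])
        if link["downstream"] is not None:
            inflow[link["downstream"]] += outflow
    return accumulation

# Hypothetical three-link network: tributaries "a" and "b" join at "c".
network = {
    "a": {"local_supply": 5.0, "capacity": 3.0, "downstream": "c"},
    "b": {"local_supply": 2.0, "capacity": 4.0, "downstream": "c"},
    "c": {"local_supply": 1.0, "capacity": 5.0, "downstream": None},
}
deposits = route_bed_material(network, order=["a", "b", "c"])
```

In this example link "c" accumulates material even though its local supply is small, because the combined yield from its tributaries exceeds its capacity; this is the kind of spatial variability in supply and capacity the abstract highlights.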
Does a Diagnostic Classification Algorithm Help to Predict the Course of Low Back Pain?

DEFF Research Database (Denmark)

Hartvigsen, Lisbeth; Kongsted, Alice; Vach, Werner

2018-01-01

Study Design A prospective observational study. Background A diagnostic classification algorithm was developed by Petersen et al., consisting of 12 categories based on a standardized examination protocol with the primary purpose of identifying clinically homogeneous subgroups of low back pain (LBP). Objectives To investigate if a diagnostic classification algorithm is associated with activity limitation and LBP intensity at 2-week and 3-month follow up, and 1-year trajectories of LBP intensity, and if it improves prediction of outcome when added to a set of known predictors. Methods 934 consecutive adult patients, with new episodes of LBP, who were visiting chiropractic practices in primary care were categorized according to the Petersen classification. Outcomes were disability and pain intensity measured at 2 weeks and 3 months, and 1-year trajectories of LBP based on weekly responses to text...
Earthquake prediction in California using regression algorithms and cloud-based big data infrastructure

Science.gov (United States)

Asencio-Cortés, G.; Morales-Esteban, A.; Shang, X.; Martínez-Álvarez, F.

2018-06-01

Earthquake magnitude prediction is a challenging problem that has been widely studied during the last decades. Statistical, geophysical and machine learning approaches can be found in the literature, with no particularly satisfactory results. In recent years, powerful computational techniques to analyze big data have emerged, making possible the analysis of massive datasets. These new methods make use of physical resources like cloud-based architectures. California is known for being one of the regions with the highest seismic activity in the world and many data are available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (a 1 GB catalog is used), in order to predict earthquake magnitude within the next seven days. The Apache Spark framework, the H2O library in the R language and Amazon cloud infrastructure were used, reporting very promising results.
A fast EM algorithm for BayesA-like prediction of genomic breeding values.

Directory of Open Access Journals (Sweden)

Xiaochen Sun

Full Text Available Prediction accuracies of estimated breeding values for economically important traits are expected to benefit from genomic information. Single nucleotide polymorphism (SNP) panels used in genomic prediction are increasing in density, but the Markov Chain Monte Carlo (MCMC) estimation of SNP effects can be quite time consuming or slow to converge when a large number of SNPs are fitted simultaneously in a linear mixed model. Here we present an EM algorithm (termed "fastBayesA") without MCMC. This fastBayesA approach treats the variances of SNP effects as missing data and uses a joint posterior mode of effects compared to the commonly used BayesA which bases predictions on posterior means of effects. In each EM iteration, SNP effects are predicted as a linear combination of best linear unbiased predictions of breeding values from a mixed linear animal model that incorporates a weighted marker-based realized relationship matrix. Method fastBayesA converges after a few iterations to a joint posterior mode of SNP effects under the BayesA model. When applied to simulated quantitative traits with a range of genetic architectures, fastBayesA is shown to predict GEBV as accurately as BayesA but with less computing effort per SNP than BayesA. Method fastBayesA can be used as a computationally efficient substitute for BayesA, especially when an increasing number of markers bring unreasonable computational burden or slow convergence to MCMC approaches.
Monte Carlo algorithms with absorbing Markov chains: Fast local algorithms for slow dynamics

International Nuclear Information System (INIS)

Novotny, M.A.

1995-01-01

A class of Monte Carlo algorithms which incorporate absorbing Markov chains is presented. In a particular limit, the lowest order of these algorithms reduces to the n-fold way algorithm. These algorithms are applied to study the escape from the metastable state in the two-dimensional square-lattice nearest-neighbor Ising ferromagnet in an unfavorable applied field, and the agreement with theoretical predictions is very good. It is demonstrated that the higher-order algorithms can be many orders of magnitude faster than either the traditional Monte Carlo or n-fold way algorithms
Positive predictive value of a register-based algorithm using the Danish National Registries to identify suicidal events.

Science.gov (United States)

Gasse, Christiane; Danielsen, Andreas Aalkjaer; Pedersen, Marianne Giørtz; Pedersen, Carsten Bøcker; Mors, Ole; Christensen, Jakob

2018-04-17

It is not possible to fully assess intention of self-harm and suicidal events using information from administrative databases. We conducted a validation study of intention of suicide attempts/self-harm contacts identified by a commonly applied Danish register-based algorithm (DK-algorithm) based on hospital discharge diagnosis and emergency room contacts. Of all 101 530 people identified with an incident suicide attempt/self-harm contact at Danish hospitals between 1995 and 2012 using the DK-algorithm, we selected a random sample of 475 people. We validated the DK-algorithm against medical records applying the definitions and terminology of the Columbia Classification Algorithm of Suicide Assessment of suicidal events, nonsuicidal events, and indeterminate or potentially suicidal events. We calculated positive predictive values (PPVs) of the DK-algorithm to identify suicidal events overall, by gender, age groups, and calendar time. We retrieved medical records for 357 (75%) people. The PPV of the DK-algorithm to identify suicidal events was 51.5% (95% CI: 46.4-56.7) overall, 42.7% (95% CI: 35.2-50.5) in males, and 58.5% (95% CI: 51.6-65.1) in females. The PPV varied further across age groups and calendar time. After excluding cases identified via the DK-algorithm by unspecific codes of intoxications and injury, the PPV improved slightly (56.8% [95% CI: 50.0-63.4]). The DK-algorithm can reliably identify self-harm with suicidal intention in 52% of the identified cases of suicide attempts/self-harm. The PPVs could be used for quantitative bias analysis and implemented as weights in future studies to estimate the proportion of suicidal events among cases identified via the DK-algorithm. Copyright © 2018 John Wiley & Sons, Ltd.
Real time implementation of a linear predictive coding algorithm on digital signal processor DSP32C

International Nuclear Information System (INIS)

Sheikh, N.M.; Usman, S.R.; Fatima, S.

2002-01-01

Pulse Code Modulation (PCM) has been widely used in speech coding. However, due to its high bit rate, PCM has severe limitations in applications where high spectral efficiency is desired, for example, in mobile communication, CD quality broadcasting systems, etc. These limitations have motivated research in bit rate reduction techniques. Linear predictive coding (LPC) is one of the most powerful complex techniques for bit rate reduction. With the introduction of powerful digital signal processors (DSP) it is possible to implement the complex LPC algorithm in real time. In this paper we present a real time implementation of the LPC algorithm on AT&T's DSP32C at a sampling frequency of 8192 Hz. Application of the LPC algorithm on two speech signals is discussed. Using this implementation, a bit rate reduction of 1:3 is achieved for better than toll quality speech, while a reduction of 1.16 is possible for speech quality required in military applications. (author)
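The core of an LPC coder is estimating predictor coefficients from the signal's autocorrelation. The sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion. It is a generic illustration, not the DSP32C implementation from the paper, and the function names and test signal are assumptions made for the example.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite signal."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, error) for the predictor
        x_hat[n] = a[0]*x[n-1] + a[1]*x[n-2] + ... + a[order-1]*x[n-order]
    where error is the remaining prediction-error energy.
    """
    a = [0.0] * order
    error = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / error
        updated = a[:]
        updated[i] = k
        for j in range(i):
            updated[j] = a[j] - k * a[i - 1 - j]
        a = updated
        error *= 1.0 - k * k
    return a, error


# Hypothetical test signal: a decaying exponential, x[n] = 0.9 * x[n-1].
signal = [0.9 ** n for n in range(200)]
r = autocorrelation(signal, 2)
coeffs, residual_energy = levinson_durbin(r, 2)
# coeffs[0] is close to 0.9 and coeffs[1] is close to 0.
```

In a coder, only the coefficients and a compact description of the excitation are transmitted per frame, which is where the bit rate reduction comes from.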
Predicting How Close Near-Earth Asteroids Will Come to Earth in the Next Five Years Using Only Kepler's Algorithm

National Research Council Canada - National Science Library

Wright, Melissa

1998-01-01

.... The goal of this investigation was to see if using only Kepler's algorithm, which ignores the gravitational pull of other planets, our moon, and Jupiter, was sufficient to predict close encounters with Earth...
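Two-body ("Kepler") propagation rests on solving Kepler's equation M = E - e sin E for the eccentric anomaly E. The sketch below does this by Newton iteration. It is a generic textbook approach and not necessarily the procedure used in the report; the function names, tolerance, and starting guess are assumptions.

```python
import math


def solve_kepler(mean_anomaly, eccentricity, tol=1e-12, max_iter=50):
    """Solve M = E - e*sin(E) for E (radians), elliptical orbits only (0 <= e < 1)."""
    # A common starting guess: M for modest eccentricity, pi otherwise.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        residual = E - eccentricity * math.sin(E) - mean_anomaly
        slope = 1.0 - eccentricity * math.cos(E)
        step = residual / slope
        E -= step
        if abs(step) < tol:
            break
    return E


def true_anomaly(E, eccentricity):
    """Convert eccentric anomaly to true anomaly (radians)."""
    return 2.0 * math.atan2(
        math.sqrt(1.0 + eccentricity) * math.sin(E / 2.0),
        math.sqrt(1.0 - eccentricity) * math.cos(E / 2.0),
    )


E = solve_kepler(1.0, 0.3)
nu = true_anomaly(E, 0.3)
```

Repeating this for the asteroid and for Earth at each time step gives both positions, from which a separation distance can be computed. Perturbations from other bodies are ignored, which is exactly the simplification the report set out to test.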
Hidden State Prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses

Directory of Open Access Journals (Sweden)

Jesse Robert Zaneveld

2014-08-01

Full Text Available Complex symbioses between animal or plant hosts and their associated microbiotas can involve thousands of species and millions of genes. Because of the number of interacting partners, it is often impractical to study all organisms or genes in these host-microbe symbioses individually. Yet new phylogenetic predictive methods can use the wealth of accumulated data on diverse model organisms to make inferences into the properties of less well-studied species and gene families. Predictive functional profiling methods use evolutionary models based on the properties of studied relatives to put bounds on the likely characteristics of an organism or gene that has not yet been studied in detail. These techniques have been applied to predict diverse features of host-associated microbial communities ranging from the enzymatic function of uncharacterized genes to the gene content of uncultured microorganisms. We consider these phylogenetically-informed predictive techniques from disparate fields as examples of a general class of algorithms for Hidden State Prediction (HSP), and argue that HSP methods have broad value in predicting organismal traits in a variety of contexts, including the study of complex host-microbe symbioses.
Hidden state prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses.

Science.gov (United States)

Zaneveld, Jesse R R; Thurber, Rebecca L V

2014-01-01

Complex symbioses between animal or plant hosts and their associated microbiotas can involve thousands of species and millions of genes. Because of the number of interacting partners, it is often impractical to study all organisms or genes in these host-microbe symbioses individually. Yet new phylogenetic predictive methods can use the wealth of accumulated data on diverse model organisms to make inferences into the properties of less well-studied species and gene families. Predictive functional profiling methods use evolutionary models based on the properties of studied relatives to put bounds on the likely characteristics of an organism or gene that has not yet been studied in detail. These techniques have been applied to predict diverse features of host-associated microbial communities ranging from the enzymatic function of uncharacterized genes to the gene content of uncultured microorganisms. We consider these phylogenetically informed predictive techniques from disparate fields as examples of a general class of algorithms for Hidden State Prediction (HSP), and argue that HSP methods have broad value in predicting organismal traits in a variety of contexts, including the study of complex host-microbe symbioses.
GCA-w Algorithms for Traffic Simulation

International Nuclear Information System (INIS)

Hoffmann, R.

2011-01-01

The GCA-w model (Global Cellular Automata with write access) is an extension of the GCA (Global Cellular Automata) model, which is based on the cellular automata model (CA). Whereas the CA model uses static links to local neighbors, the GCA model uses dynamic links to potentially global neighbors. The GCA-w model is a further extension that allows modifying the neighbors' states. Thereby, neighbors can dynamically be activated or deactivated. Algorithms can be described more concisely and may execute more efficiently because redundant computations can be avoided. Modeling traffic flow is a good example showing the usefulness of the GCA-w model. The Nagel-Schreckenberg algorithm for traffic simulation is first described as CA and GCA, and then transformed into the GCA-w model. This algorithm is "exclusive-write", meaning that no write conflicts have to be resolved. Furthermore, this algorithm is extended to allow cars stuck in a traffic jam to be deactivated and reactivated in order to save computation time and energy. (author)
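For orientation, one update of the standard Nagel-Schreckenberg rule set (accelerate, brake to the gap, random slowdown, move) can be sketched as below. This is the plain car-oriented formulation on a circular road, not the GCA-w variant developed in the paper, and the parameter names and defaults are assumptions.

```python
import random


def nasch_step(positions, velocities, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel Nagel-Schreckenberg update on a circular single-lane road.

    positions:  cell index of each car, listed in driving order
                (car i + 1 is directly ahead of car i, wrapping around).
    velocities: current speed of each car, in cells per step.
    Returns (new_positions, new_velocities) in the same car order.
    """
    n = len(positions)
    new_velocities = []
    for i in range(n):
        ahead = positions[(i + 1) % n]
        # Number of empty cells between this car and the one ahead.
        gap = (ahead - positions[i] - 1) % road_length
        v = min(velocities[i] + 1, v_max)        # 1. accelerate
        v = min(v, gap)                          # 2. brake to avoid collision
        if v > 0 and rng.random() < p_slow:      # 3. random slowdown
            v -= 1
        new_velocities.append(v)
    new_positions = [
        (x + v) % road_length for x, v in zip(positions, new_velocities)
    ]                                            # 4. move
    return new_positions, new_velocities


# Deterministic example (p_slow=0.0): three cars on a 10-cell ring.
pos, vel = nasch_step([0, 2, 3], [0, 0, 0], road_length=10, p_slow=0.0)
# pos == [1, 2, 4], vel == [1, 0, 1]
```

Each car reads only the position of the car ahead and writes only its own state, which is consistent with the exclusive-write property mentioned in the abstract.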
Prediction of Aerodynamic Coefficient using Genetic Algorithm Optimized Neural Network for Sparse Data

Science.gov (United States)

Rajkumar, T.; Bardina, Jorge; Clancy, Daniel (Technical Monitor)

2002-01-01

Wind tunnels use scale models to characterize aerodynamic coefficients. Wind tunnel testing can be slow and costly due to high personnel overhead and intensive power utilization. Although manual curve fitting can be done, it is highly efficient to use a neural network to define the complex relationship between variables. Numerical simulation of complex vehicles on the wide range of conditions required for flight simulation requires static and dynamic data. Static data at low Mach numbers and angles of attack may be obtained with simpler Euler codes. Static data of stalled vehicles where zones of flow separation are usually present at higher angles of attack require Navier-Stokes simulations which are costly due to the large processing time required to attain convergence. Preliminary dynamic data may be obtained with simpler methods based on correlations and vortex methods; however, accurate prediction of the dynamic coefficients requires complex and costly numerical simulations. A reliable and fast method of predicting complex aerodynamic coefficients for flight simulation is presented using a neural network. The training data for the neural network are derived from numerical simulations and wind-tunnel experiments. The aerodynamic coefficients are modeled as functions of the flow characteristics and the control surfaces of the vehicle. The basic coefficients of lift, drag and pitching moment are expressed as functions of angles of attack and Mach number. The modeled and training aerodynamic coefficients show good agreement. This method shows excellent potential for rapid development of aerodynamic models for flight simulation. Genetic Algorithms (GA) are used to optimize a previously built Artificial Neural Network (ANN) that reliably predicts aerodynamic coefficients. Results indicate that the GA provided an efficient method of optimizing the ANN model to predict aerodynamic coefficients. The reliability of the ANN using the GA includes prediction of aerodynamic
A Dantzig-Wolfe decomposition algorithm for linear economic model predictive control of dynamically decoupled subsystems

DEFF Research Database (Denmark)

Sokoler, Leo Emil; Standardi, Laura; Edlund, Kristian

2014-01-01

This paper presents a warm-started Dantzig–Wolfe decomposition algorithm tailored to economic model predictive control of dynamically decoupled subsystems. We formulate the constrained optimal control problem solved at each sampling instant as a linear program with state space constraints, input...... limits, input rate limits, and soft output limits. The objective function of the linear program is related directly to the cost of operating the subsystems, and the cost of violating the soft output constraints. Simulations for large-scale economic power dispatch problems show that the proposed algorithm...... is significantly faster than both state-of-the-art linear programming solvers, and a structure exploiting implementation of the alternating direction method of multipliers. It is also demonstrated that the control strategy presented in this paper can be tuned using a weighted ℓ1-regularization term...
Investigation of Diesel’s Residual Noise on Predictive Vehicles Noise Cancelling using LMS Adaptive Algorithm

Science.gov (United States)

Arttini Dwi Prasetyowati, Sri; Susanto, Adhi; Widihastuti, Ida

2017-04-01

Every noise problem requires a different solution. In this research, the noise to be cancelled comes from the roadway. The Least Mean Square (LMS) adaptive algorithm is one of the algorithms that can be used to cancel that noise. Residual noise always appears and cannot be erased completely. This research aims to characterize the residual noise from vehicle noise and to analyze it so that it no longer appears as a problem. The LMS algorithm was used to predict the vehicle's noise and minimize the error. The distribution of the residual noise could be observed to determine the specificity of the residual noise. The statistics of the residual noise are close to a normal distribution with = 0,0435, = 1,13, and the autocorrelation of the residual noise forms an impulse. In conclusion, the residual noise is insignificant.
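An LMS predictor of the kind described above can be sketched in a few lines. The example is a generic one-step-ahead LMS filter run on a synthetic sinusoid. The tap count, step size, and test signal are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def lms_predict(signal, taps=4, mu=0.05):
    """One-step-ahead LMS prediction.

    At each sample the filter predicts signal[n] from the previous `taps`
    samples, then nudges its weights along the error gradient.
    Returns (final_weights, errors), where errors is the residual sequence.
    """
    weights = [0.0] * taps
    errors = []
    for n in range(taps, len(signal)):
        window = signal[n - taps:n][::-1]  # most recent sample first
        prediction = sum(w * x for w, x in zip(weights, window))
        error = signal[n] - prediction
        # LMS weight update: w <- w + 2 * mu * e * x
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, window)]
        errors.append(error)
    return weights, errors


# Synthetic periodic "noise" standing in for a recorded vehicle signal.
signal = [math.sin(0.5 * n) for n in range(1000)]
weights, errors = lms_predict(signal)
early = max(abs(e) for e in errors[:10])
late = max(abs(e) for e in errors[-50:])
# The residual shrinks as the filter adapts: late is far smaller than early.
```

With real recordings the residual does not vanish. What remains is the unpredictable part of the noise, whose distribution and autocorrelation are the quantities examined in the abstract.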

Quantum-circuit model of Hamiltonian search algorithms

International Nuclear Information System (INIS)

Roland, Jeremie; Cerf, Nicolas J.

2003-01-01

We analyze three different quantum search algorithms, namely, the traditional circuit-based Grover's algorithm, its continuous-time analog by Hamiltonian evolution, and the quantum search by local adiabatic evolution. We show that these algorithms are closely related in the sense that they all perform a rotation, at a constant angular velocity, from a uniform superposition of all states to the solution state. This makes it possible to implement the two Hamiltonian-evolution algorithms on a conventional quantum circuit, while keeping the quadratic speedup of Grover's original algorithm. It also clarifies the link between the adiabatic search algorithm and Grover's algorithm
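The "rotation at constant angular velocity" picture has a simple numerical form for the circuit-based algorithm: with one marked item among N, each Grover iteration rotates the state by 2θ, where sin θ = 1/√N. The sketch below evaluates the resulting success probability. It uses the standard textbook formula rather than anything specific to this paper, and the function names are assumptions.

```python
import math


def grover_success_probability(n_items, iterations):
    """Probability of measuring the marked item after the given iterations."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2


def optimal_iterations(n_items):
    """Iteration count that brings the state closest to the solution."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return math.floor(math.pi / (4.0 * theta))


k = optimal_iterations(1024)
p = grover_success_probability(1024, k)
# k grows like sqrt(N), the quadratic speedup over checking N items one by one.
```

For N = 4 a single iteration already rotates the state exactly onto the solution, so the success probability is 1 up to rounding.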
Microprocessor-controlled CAMAC data link module

International Nuclear Information System (INIS)

Potter, J.M.

1978-05-01

Communication between the central control computer and remote, satellite data-acquisition/control stations at the Clinton P. Anderson Meson Physics Facility (LAMPF) is presently accomplished through the use of CAMAC-based Data Link modules. With the advent of the microprocessor, a new philosophy for digital data communications has evolved. Data Link modules containing microprocessor controllers provide link management and communication network protocol through algorithms executed in the Data Link microprocessor. 13 figures
PREDICTIVE CONTROL OF A BATCH POLYMERIZATION SYSTEM USING A FEEDFORWARD NEURAL NETWORK WITH ONLINE ADAPTATION BY GENETIC ALGORITHM

Directory of Open Access Journals (Sweden)

A. Cancelier

Full Text Available Abstract This study used a predictive controller based on an empirical nonlinear model comprising a three-layer feedforward neural network for temperature control of the suspension polymerization process. In addition to the offline training technique, an algorithm was also analyzed for online adaptation of its parameters. For the offline training, the network was statically trained and the genetic algorithm technique was used in combination with the least squares method. For online training, the network was trained on a recurring basis and only the technique of genetic algorithms was used. In this case, only the weights and bias of the output layer neuron were modified, starting from the parameters obtained from the offline training. From the experimental results obtained in a pilot plant, a good performance was observed for the proposed control system, with superior performance for the control algorithm with online adaptation of the model, particularly with respect to the presence of off-set for the case of the fixed parameters model.
Soft Computing Methods for Disulfide Connectivity Prediction.

Science.gov (United States)

Márquez-Chamorro, Alfonso E; Aguilar-Ruiz, Jesús S

2015-01-01

The problem of protein structure prediction (PSP) is one of the main challenges in structural bioinformatics. To tackle this problem, PSP can be divided into several subproblems. One of these subproblems is the prediction of disulfide bonds. The disulfide connectivity prediction problem consists in identifying which nonadjacent cysteines would be cross-linked from all possible candidates. Determining the disulfide bond connectivity between the cysteines of a protein is desirable as a previous step of the 3D PSP, as the protein conformational search space is highly reduced. The most representative soft computing approaches for the disulfide bonds connectivity prediction problem of the last decade are summarized in this paper. Certain aspects, such as the different methodologies based on soft computing approaches (artificial neural network or support vector machine) or features of the algorithms, are used for the classification of these methods.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

Directory of Open Access Journals (Sweden)

Joeri Ruyssinck

Full Text Available One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
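The rankwise averaging step mentioned above can be illustrated with a small sketch. It assumes each method returns an importance score per candidate regulatory link. The gene names, scores, and function names are hypothetical, and this is not the NIMEFI implementation itself.

```python
def rank_scores(scores):
    """Convert link -> importance score into link -> rank (1 = strongest).

    Ties are broken by insertion order because Python's sort is stable.
    """
    ordered = sorted(scores, key=lambda link: scores[link], reverse=True)
    return {link: rank for rank, link in enumerate(ordered, start=1)}


def rankwise_average(score_tables):
    """Average the per-method ranks of every link (lower is stronger)."""
    ranks = [rank_scores(table) for table in score_tables]
    return {
        link: sum(r[link] for r in ranks) / len(ranks)
        for link in ranks[0]
    }


# Hypothetical importance scores from two methods for three candidate links.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
consensus = rankwise_average([method_a, method_b])
# consensus == {('g1','g2'): 1.5, ('g1','g3'): 3.0, ('g2','g3'): 1.5}
```

Working on ranks rather than raw scores avoids having to put importance values from different regression methods on a common scale.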
Dynamic Heat Supply Prediction Using Support Vector Regression Optimized by Particle Swarm Optimization Algorithm

Directory of Open Access Journals (Sweden)

Meiping Wang

2016-01-01

Full Text Available We developed an effective intelligent model to predict the dynamic heat supply of a heat source. A hybrid forecasting method was proposed based on a support vector regression (SVR) model optimized by particle swarm optimization (PSO) algorithms. Due to the interaction of meteorological conditions and the heating parameters of the heating system, it is extremely difficult to forecast dynamic heat supply. Firstly, the correlations among heat supply and related influencing factors in the heating system were analyzed through the correlation analysis of statistical theory. Then, the SVR model was employed to forecast dynamic heat supply. In the model, the input variables were selected based on the correlation analysis and three crucial parameters, including the penalty factor, gamma of the RBF kernel, and the insensitive loss function, were optimized by PSO algorithms. The optimized SVR model was compared with the basic SVR, genetic algorithm-optimized SVR (GA-SVR), and artificial neural network (ANN) through six groups of experiment data from two heat sources. The results of the correlation coefficient analysis revealed the relationship between the influencing factors and the forecasted heat supply and determined the input variables. The performance of the PSO-SVR model is superior to those of the other three models. The PSO-SVR method is statistically robust and can be applied to practical heating systems.
Implementation of Freeman-Wimley prediction algorithm in a web-based application for in silico identification of beta-barrel membrane proteins

OpenAIRE

José Antonio Agüero-Fernández; Lisandra Aguilar-Bultet; Yandy Abreu-Jorge; Agustín Lage-Castellanos; Yannier Estévez-Dieppa

2015-01-01

Beta-barrel type proteins play an important role in both human and veterinary medicine. In particular, their localization on the bacterial surface and their involvement in virulence mechanisms of pathogens have turned them into an interesting target in studies to search for vaccine candidates. Recently, Freeman and Wimley developed a prediction algorithm based on the physicochemical properties of transmembrane beta-barrel proteins (TMBBs). Based on that algorithm, and using Grails, a web-...
Predicting the Retention Behavior of Specific O-Linked Glycopeptides.

Science.gov (United States)

Badgett, Majors J; Boyes, Barry; Orlando, Ron

2017-09-01

O-Linked glycosylation is a common post-translational modification that can alter the overall structure, polarity, and function of proteins. Reverse-phase (RP) chromatography is the most common chromatographic approach to analyze O-glycosylated peptides and their unmodified counterparts, even though this approach often does not provide adequate separation of these two species. Hydrophilic interaction liquid chromatography (HILIC) can be a solution to this problem, as the polar glycan interacts with the polar stationary phase and potentially offers the ability to resolve the peptide from its modified form(s). In this paper, HILIC is used to separate peptides with O-N-acetylgalactosamine (O-GalNAc), O-N-acetylglucosamine (O-GlcNAc), and O-fucose additions from their native forms, and coefficients representing the extent of hydrophilicity were derived using linear regression analysis as a means to predict the retention times of peptides with these modifications.
Lightweight link dimensioning using sFlow sampling

DEFF Research Database (Denmark)

de Oliviera Schmidt, Ricardo; Sadre, Ramin; Sperotto, Anna

2013-01-01

not be trivial in high-speed links. Aiming at scalability, operators often deploy packet sampling for monitoring, but little is known about how it affects link dimensioning. In this paper we assess the feasibility of lightweight link dimensioning using sFlow, which is a widely-deployed traffic monitoring tool. We...... implement the sFlow sampling algorithm and use a previously proposed and validated dimensioning formula that needs traffic variance. We validate our approach using packet captures from real networks. Results show that the proposed procedure is successful for a range of sampling rates and that, due to randomness...... of the sampling algorithm, the error introduced by scaling the traffic variance yields more conservative results that cope with short-term traffic fluctuations....
Online co-regularized algorithms

NARCIS (Netherlands)

Ruijter, T. de; Tsivtsivadze, E.; Heskes, T.

2012-01-01

We propose an online co-regularized learning algorithm for classification and regression tasks. We demonstrate that by sequentially co-regularizing prediction functions on unlabeled data points, our algorithm provides improved performance in comparison to supervised methods on several UCI benchmarks.
Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms.

Science.gov (United States)

Schädler, Marc R; Warzybok, Anna; Kollmeier, Birger

2018-01-01

The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments in an objective way, independent from any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs and benefits in SRT with the German matrix sentence recognition test when using eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge about the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Possibly, additional individual factors should be considered for more accurate predictions with impaired hearing. A competing talker condition clearly showed one limitation of current ASR technology, as the empirical performance with SRTs lower than -20 dB could not be predicted.
Optimal Parameter Selection for Support Vector Machine Based on Artificial Bee Colony Algorithm: A Case Study of Grid-Connected PV System Power Prediction.

Science.gov (United States)

Gao, Xiang-Ming; Yang, Shi-Feng; Pan, San-Bo

2017-01-01

Predicting the output power of photovoltaic system with nonstationarity and randomness, an output power prediction model for grid-connected PV systems is proposed based on empirical mode decomposition (EMD) and support vector machine (SVM) optimized with an artificial bee colony (ABC) algorithm. First, according to the weather forecast data sets on the prediction date, the time series data of output power on a similar day with 15-minute intervals are built. Second, the time series data of the output power are decomposed into a series of components, including some intrinsic mode components IMFn and a trend component Res, at different scales using EMD. The corresponding SVM prediction model is established for each IMF component and trend component, and the SVM model parameters are optimized with the artificial bee colony algorithm. Finally, the prediction results of each model are reconstructed, and the predicted values of the output power of the grid-connected PV system can be obtained. The prediction model is tested with actual data, and the results show that the power prediction model based on the EMD and ABC-SVM has a faster calculation speed and higher prediction accuracy than do the single SVM prediction model and the EMD-SVM prediction model without optimization.
Optimal Parameter Selection for Support Vector Machine Based on Artificial Bee Colony Algorithm: A Case Study of Grid-Connected PV System Power Prediction

Directory of Open Access Journals (Sweden)

Xiang-ming Gao

2017-01-01

Full Text Available Predicting the output power of photovoltaic system with nonstationarity and randomness, an output power prediction model for grid-connected PV systems is proposed based on empirical mode decomposition (EMD) and support vector machine (SVM) optimized with an artificial bee colony (ABC) algorithm. First, according to the weather forecast data sets on the prediction date, the time series data of output power on a similar day with 15-minute intervals are built. Second, the time series data of the output power are decomposed into a series of components, including some intrinsic mode components IMFn and a trend component Res, at different scales using EMD. The corresponding SVM prediction model is established for each IMF component and trend component, and the SVM model parameters are optimized with the artificial bee colony algorithm. Finally, the prediction results of each model are reconstructed, and the predicted values of the output power of the grid-connected PV system can be obtained. The prediction model is tested with actual data, and the results show that the power prediction model based on the EMD and ABC-SVM has a faster calculation speed and higher prediction accuracy than do the single SVM prediction model and the EMD-SVM prediction model without optimization.
Enhancing Accuracy of Sediment Total Load Prediction Using Evolutionary Algorithms (Case Study: Gotoorchay River)

Directory of Open Access Journals (Sweden)

K. Roshangar

2016-09-01

Full Text Available Introduction: Exact prediction of the sediment rate transported by rivers is of utmost importance in water resources projects. The erosion and sediment transport process is among the most complex in hydrodynamics. Although different studies have been developed on the application of intelligent models based on neural networks, they are not widely used because they lack explicitness and because choosing and architecting a proper network is complex. In this study, a genetic expression programming model (an important branch of evolutionary algorithms) for predicting sediment load is selected and investigated as an intelligent approach along with other known classical and empirical methods such as Larsen's equation, Engelund-Hansen's equation and Bagnold's equation. Materials and Methods: In this study, in order to improve explicit prediction of the sediment load of Gotoorchay, located in the Aras catchment, Northwestern Iran (latitude: 38°24´33.3˝ and longitude: 44°46´13.2˝), genetic programming (GP) and Genetic Algorithm (GA) were applied. Moreover, the semi-empirical models for predicting total sediment load and the rating curve have been used. Finally all the methods were compared and the best ones were introduced. Two statistical measures were used to compare the performance of the different models, namely root mean square error (RMSE) and determination coefficient (DC). RMSE and DC indicate the discrepancy between the observed and computed values. Results and Discussions: The statistical results obtained from the analysis of the genetic programming method for both selected model groups indicated that model 4, which includes only the discharge of the river, had the highest DC and the lowest RMSE in the testing stage relative to the other studied models (DC= 0.907, RMSE= 0.067). Although several parameters were applied in other models, these models were complicated and gave weak prediction results. Our results showed that the model 9
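The two performance measures named above can be computed as in the following sketch. The abstract does not give its formula for the determination coefficient, so the common form DC = 1 - SS_res / SS_tot is assumed here. The sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def determination_coefficient(observed, predicted):
    """DC = 1 - SS_res / SS_tot (assumed definition; 1.0 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and modelled sediment loads.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
error = rmse(observed, predicted)                    # 1.0
dc = determination_coefficient(observed, predicted)  # about 0.2
```

A model is preferred when its RMSE is low and its DC is close to 1, which is the sense in which model 4 is reported as best in the testing stage.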
Prostate cancer prediction using the random forest algorithm that takes into account transrectal ultrasound findings, age, and serum levels of prostate-specific antigen

Directory of Open Access Journals (Sweden)

Li-Hong Xiao

2017-01-01

Full Text Available The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P < 0.001), as well as in all transrectal ultrasound characteristics (P < 0.05) except uneven echo (P = 0.609). The random forest model based on age, prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than either diagnostic indicator on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
Cascade of links in complex networks

Energy Technology Data Exchange (ETDEWEB)

Feng, Yeqian; Sun, Bihui [Department of Management Science, School of Government, Beijing Normal University, 100875 Beijing (China); Zeng, An, E-mail: anzeng@bnu.edu.cn [School of Systems Science, Beijing Normal University, 100875 Beijing (China)

2017-01-30

Cascading failure is an important process which has been widely used to model catastrophic events such as blackouts and financial crisis in real systems. However, so far most of the studies in the literature focus on the cascading process on nodes, leaving the possibility of link cascade overlooked. In many real cases, the catastrophic events are actually formed by the successive disappearance of links. Examples exist in the financial systems where the firms and banks (i.e. nodes) still exist but many financial trades (i.e. links) are gone during the crisis, and the air transportation systems where the airports (i.e. nodes) are still functional but many airlines (i.e. links) stop operating during bad weather. In this letter, we develop a link cascade model in complex networks. With this model, we find that both artificial and real networks tend to collapse even if a few links are initially attacked. However, the link cascading process can be effectively terminated by setting a few strong nodes in the network which do not respond to any link reduction. Finally, a simulated annealing algorithm is used to optimize the location of these strong nodes, which significantly improves the robustness of the networks against the link cascade. - Highlights: • We propose a link cascade model in complex networks. • Both artificial and real networks tend to collapse even if a few links are initially attacked. • The link cascading process can be effectively terminated by setting a few strong nodes. • A simulated annealing algorithm is used to optimize the location of these strong nodes.
Cascade of links in complex networks

International Nuclear Information System (INIS)

Feng, Yeqian; Sun, Bihui; Zeng, An

2017-01-01

Cascading failure is an important process which has been widely used to model catastrophic events such as blackouts and financial crisis in real systems. However, so far most of the studies in the literature focus on the cascading process on nodes, leaving the possibility of link cascade overlooked. In many real cases, the catastrophic events are actually formed by the successive disappearance of links. Examples exist in the financial systems where the firms and banks (i.e. nodes) still exist but many financial trades (i.e. links) are gone during the crisis, and the air transportation systems where the airports (i.e. nodes) are still functional but many airlines (i.e. links) stop operating during bad weather. In this letter, we develop a link cascade model in complex networks. With this model, we find that both artificial and real networks tend to collapse even if a few links are initially attacked. However, the link cascading process can be effectively terminated by setting a few strong nodes in the network which do not respond to any link reduction. Finally, a simulated annealing algorithm is used to optimize the location of these strong nodes, which significantly improves the robustness of the networks against the link cascade. - Highlights: • We propose a link cascade model in complex networks. • Both artificial and real networks tend to collapse even if a few links are initially attacked. • The link cascading process can be effectively terminated by setting a few strong nodes. • A simulated annealing algorithm is used to optimize the location of these strong nodes.
Prediction of Cancer Proteins by Integrating Protein Interaction, Domain Frequency, and Domain Interaction Data Using Machine Learning Algorithms

Directory of Open Access Journals (Sweden)

Chien-Hung Huang

2015-01-01

Full Text Available Many proteins are known to be associated with cancer diseases. Quite often, their precise functional role in disease pathogenesis remains unclear. One strategy for gaining a better understanding of the function of these proteins is to combine different types of proteomics data. In this study, we extended Aragues's method by employing protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, the weighted domain frequency score (DFS), and cancer linker degree (CLD) data to predict cancer proteins. Performance was benchmarked in three kinds of experiments: (I) using individual algorithms, (II) combining algorithms, and (III) combining algorithms of the same classification type. Compared with Aragues's method, our proposed methods, that is, machine learning algorithms and majority voting, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm achieves hit ratios of 89.4% and 72.8% for the lung cancer dataset and the lung cancer microarray study, respectively. It is anticipated that the current research could help in understanding disease mechanisms and diagnosis.
Small Body GN&C Research Report: A Robust Model Predictive Control Algorithm with Guaranteed Resolvability

Science.gov (United States)

Acikmese, Behcet A.; Carson, John M., III

2005-01-01

A robustly stabilizing MPC (model predictive control) algorithm for uncertain nonlinear systems is developed that guarantees the resolvability of the associated finite-horizon optimal control problem in a receding-horizon implementation. The control consists of two components: (i) a feedforward part and (ii) a feedback part. The feedforward control is obtained by online solution of a finite-horizon optimal control problem for the nominal system dynamics. The feedback control policy is designed off-line based on a bound on the uncertainty in the system model. The entire controller is shown to be robustly stabilizing with a region of attraction composed of initial states for which the finite-horizon optimal control problem is feasible. The controller design for this algorithm is demonstrated on a class of systems with uncertain nonlinear terms that have norm-bounded derivatives, and derivatives in polytopes. An illustrative numerical example is also provided.
Experimental 2.5-Gb/s QPSK WDM phase-modulated radio-over-fiber link with digital demodulation by a K-means algorithm

DEFF Research Database (Denmark)

Guerrero Gonzalez, Neil; Zibar, Darko; Caballero Jambrina, Antonio

2010-01-01

The highest reported bit rate of 2.5 Gb/s for an optically phase-modulated radio-over-fiber (RoF) link, employing digital coherent detection, is demonstrated. Demodulation of 3 × 2.5 Gb/s quadrature phase-shift keying modulated wavelength-division-multiplexed RoF channels is achieved after 79 km ...... of transmission through deployed fiber. Error-free performance (bit-error rate corresponding to $10^{-4}$) is achieved using a digital coherent receiver in combination with a K-means algorithm for radio-frequency phase recovery....

Optimization of a predictive controller of a pressurized water reactor Xenon oscillation using the particle swarm optimization algorithm

International Nuclear Information System (INIS)

Medeiros, Jose Antonio Carlos Canedo; Machado, Marcelo Dornellas; Lima, Alan Miranda M. de; Schirru, Roberto

2007-01-01

Predictive control systems use a model of the controlled system (the plant) to predict the plant's future behavior, which allows an anticipative control to be established based on a future condition of the plant, together with an optimizer that, considering a future time horizon of the plant output and a recent horizon of the control action, determines the controller's outputs so as to optimize a performance index of the controlled plant. The predictive control system does not require analytical models of the plant; the predictor model can be learned from historical operating data of the plant. The optimizer of the predictive controller establishes the control strategy: present and future control actions are computed so as to minimize a performance index (objective function). The control strategy implemented by the optimizer induces an optimal control mechanism whose effect is to reduce the stabilization time, the overshoot and the undershoot, and to minimize the control actuation, so that a compromise among those objectives is attained. The optimizer of the predictive controller is usually implemented using gradient-based algorithms. In this work we use the Particle Swarm Optimization (PSO) algorithm in the optimizer component of a predictive controller applied to the control of the xenon oscillation of a pressurized water reactor (PWR). PSO is a simple stochastic optimization technique applied in several disciplines, capable of providing a global optimum for highly complex problems that are difficult to optimize, and in many cases it gives better results than those obtained by other conventional and/or artificial optimization techniques. (author)
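The particle swarm update that the abstract contrasts with gradient-based optimizers can be sketched as follows. This is a minimal, generic PSO under stated assumptions, not the authors' controller: the function name `pso_minimize`, the inertia and acceleration coefficients, and the quadratic stand-in objective are all hypothetical choices made for illustration. In a predictive controller the objective would instead be the performance index evaluated over the prediction horizon.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO.

    All parameter values are illustrative assumptions, not values
    taken from the abstract above.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the particle back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


if __name__ == "__main__":
    # Stand-in objective: a quadratic bowl with its minimum at the origin.
    sphere = lambda x: sum(v * v for v in x)
    print(pso_minimize(sphere, [(-5.0, 5.0)] * 2))
```

No gradient of the objective is needed, which is the property that makes this kind of optimizer attractive when the plant model is learned from data rather than given analytically.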
Multicontroller: an object programming approach to introduce advanced control algorithms for the GCS large scale project

CERN Document Server

Cabaret, S; Coppier, H; Rachid, A; Barillère, R; CERN. Geneva. IT Department

2007-01-01

The GCS (Gas Control System) project team at CERN uses a Model Driven Approach with a framework, UNICOS (UNified Industrial COntrol System), based on PLC (Programmable Logic Controller) and SCADA (Supervisory Control And Data Acquisition) technologies. The first UNICOS versions were able to provide a PID (Proportional-Integral-Derivative) controller, whereas the Gas Systems required more advanced control strategies. The MultiController is a new UNICOS object which provides the following advanced control algorithms: Smith Predictor, PFC (Predictive Function Control), RST* and GPC (Global Predictive Control). Its design is based on a monolithic entity with a global structure definition which is able to capture the desired set of parameters of any specific control algorithm supported by the object. The SCADA system, PVSS, supervises the MultiController operation. The PVSS interface provides users with a supervision faceplate; in particular, it links any MultiController with recipes: the GCS experts are ab...
Wind Power Grid Connected Capacity Prediction Using LSSVM Optimized by the Bat Algorithm

Directory of Open Access Journals (Sweden)

Qunli Wu

2015-12-01

Full Text Available Given the stochastic nature of wind, wind power grid-connected capacity prediction plays an essential role in coping with the challenge of balancing supply and demand. Accurate forecasting methods contribute greatly to mapping wind power strategy, power dispatching and the sustainable development of the wind power industry. This study proposes a bat algorithm (BA)–least squares support vector machine (LSSVM) hybrid model to improve prediction performance. To select the inputs of the LSSVM effectively, stationarity, cointegration and Granger causality tests are conducted to examine the influence of installed capacity at different lags, and partial autocorrelation analysis is employed to investigate the inner relationship of grid-connected capacity. The parameters of the LSSVM are optimized by the BA to validate the learning ability and generalization of the LSSVM. Multiple model sufficiency evaluation methods are utilized. The results reveal that the accuracy improvement of the present approach can reach about 20% compared to other single or hybrid models.
Prediction Method for Rain Rate and Rain Propagation Attenuation for K-Band Satellite Communications Links in Tropical Areas

Directory of Open Access Journals (Sweden)

Baso Maruddani

2015-01-01

Full Text Available This paper presents a prediction method based on a hidden Markov model (HMM) for rain rate and rain propagation attenuation on K-band satellite communication links in tropical areas. K-band frequencies are known to be susceptible to atmospheric conditions, especially rain. Because the K-band wavelength approaches the size of rain droplets, the signal is easily attenuated and absorbed by them. To maintain system performance for a K-band satellite communication link, special attention must therefore be paid to rain rate and rain propagation attenuation. Thus, an HMM-based prediction method for rain rate and rain propagation attenuation is developed to process the measurement data. The measured and predicted data are then compared with the ITU-R recommendations. The results show that the measured and predicted data are similar to the model of ITU-R Recommendation P.837-5 for rain rate and the model of ITU-R Recommendation P.618-10 for rain propagation attenuation. Meanwhile, statistics of the measured and predicted data, such as fade duration and interfade duration, show insignificant discrepancy from the model of ITU-R Recommendation P.1623-1.
Choosing algorithms for TB screening: a modelling study to compare yield, predictive value and diagnostic burden.

Science.gov (United States)

Van't Hoog, Anna H; Onozaki, Ikushi; Lonnroth, Knut

2014-10-19

To inform the choice of an appropriate screening and diagnostic algorithm for tuberculosis (TB) screening initiatives in different epidemiological settings, we compare algorithms composed of currently available methods. Of twelve algorithms composed of screening for symptoms (prolonged cough or any TB symptom) and/or chest radiography abnormalities, and either sputum-smear microscopy (SSM) or Xpert MTB/RIF (XP) as confirmatory test we model algorithm outcomes and summarize the yield, number needed to screen (NNS) and positive predictive value (PPV) for different levels of TB prevalence. Screening for prolonged cough has low yield, 22% if confirmatory testing is by SSM and 32% if XP, and a high NNS, exceeding 1000 if TB prevalence is ≤0.5%. Due to low specificity the PPV of screening for any TB symptom followed by SSM is less than 50%, even if TB prevalence is 2%. CXR screening for TB abnormalities followed by XP has the highest case detection (87%) and lowest NNS, but is resource intensive. CXR as a second screen for symptom screen positives improves efficiency. The ideal algorithm does not exist. The choice will be setting specific, for which this study provides guidance. Generally an algorithm composed of CXR screening followed by confirmatory testing with XP can achieve the lowest NNS and highest PPV, and is the least amenable to setting-specific variation. However resource requirements for tests and equipment may be prohibitive in some settings and a reason to opt for symptom screening and SSM. To better inform disease control programs we need empirical data to confirm the modeled yield, cost-effectiveness studies, transmission models and a better screening test.
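The abstract's remark that low specificity keeps the PPV below 50% even at 2% prevalence follows directly from Bayes' rule, and a short calculation makes the dependence on prevalence explicit. The sketch below is illustrative only: the function name `ppv` and the sensitivity and specificity values are hypothetical and are not the test characteristics used in the study.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value of a test, from Bayes' rule."""
    true_pos = prevalence * sensitivity                # diseased and test-positive
    false_pos = (1 - prevalence) * (1 - specificity)   # healthy but test-positive
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical screening test: 90% sensitive, 95% specific.
    for prevalence in (0.005, 0.01, 0.02):
        print(f"prevalence {prevalence:.1%}: PPV {ppv(prevalence, 0.90, 0.95):.1%}")
```

With these assumed values the false positives from the large healthy population outnumber the true positives at every prevalence shown, so PPV stays well under 50% and drops further as prevalence falls.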
A novel clustering algorithm based on quantum games

International Nuclear Information System (INIS)

Li Qiang; He Yan; Jiang Jingping

2009-01-01

Enormous successes have been made by quantum algorithms during the last decade. In this paper, we combine the quantum game with the problem of data clustering, and then develop a quantum-game-based clustering algorithm, in which data points in a dataset are considered as players who can make decisions and implement quantum strategies in quantum games. After each round of a quantum game, each player's expected payoff is calculated. Later, he uses a link-removing-and-rewiring (LRR) function to change his neighbors and adjust the strength of links connecting to them in order to maximize his payoff. Further, algorithms are discussed and analyzed in two cases of strategies, two payoff matrixes and two LRR functions. Consequently, the simulation results have demonstrated that data points in datasets are clustered reasonably and efficiently, and the clustering algorithms have fast rates of convergence. Moreover, the comparison with other algorithms also provides an indication of the effectiveness of the proposed approach.
Identification and prediction of dynamic systems using an interactively recurrent self-evolving fuzzy neural network.

Science.gov (United States)

Lin, Yang-Yin; Chang, Jyh-Yeong; Lin, Chin-Teng

2013-02-01

This paper presents a novel recurrent fuzzy neural network, called an interactively recurrent self-evolving fuzzy neural network (IRSFNN), for prediction and identification of dynamic systems. The recurrent structure in an IRSFNN is formed by external loops and internal feedback, feeding the firing strength of each rule to other rules and to itself. The consequent part of the IRSFNN is of either a Takagi-Sugeno-Kang (TSK) type or a functional-link-based type. The proposed IRSFNN employs a functional link neural network (FLNN) in the consequent part of the fuzzy rules to improve the mapping ability. Unlike in a TSK-type fuzzy neural network, the FLNN in the consequent part is a nonlinear function of the input variables. IRSFNN learning starts with an empty rule base, and all of the rules are generated and learned online through simultaneous structure and parameter learning. An online clustering algorithm is effective in generating fuzzy rules. The consequent parameters are updated by a variable-dimensional Kalman filter algorithm. The premise and recurrent parameters are learned through a gradient descent algorithm. We test the IRSFNN for the prediction and identification of dynamic plants and compare it to other well-known recurrent FNNs. The proposed model obtains enhanced performance results.
HomoTarget: a new algorithm for prediction of microRNA targets in Homo sapiens.

Science.gov (United States)

Ahmadi, Hamed; Ahmadi, Ali; Azimzadeh-Jamalkandi, Sadegh; Shoorehdeli, Mahdi Aliyari; Salehzadeh-Yazdi, Ali; Bidkhori, Gholamreza; Masoudi-Nejad, Ali

2013-02-01

MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, a combined pattern recognition neural network (PRNN) and principal component analysis (PCA) architecture is proposed to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the miRBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates. Copyright © 2012 Elsevier Inc. All rights reserved.
Efficient Geo-Computational Algorithms for Constructing Space-Time Prisms in Road Networks

Directory of Open Access Journals (Sweden)

Hui-Ping Chen

2016-11-01

Full Text Available The space-time prism (STP) is a key concept in time geography for analyzing human activity-travel behavior under various space-time constraints. Most existing time-geographic studies use a straightforward algorithm to construct STPs in road networks by using two one-to-all shortest path searches. However, this straightforward algorithm can introduce considerable computational overhead, given that the accessible links in an STP are generally a small portion of the whole network. To address this issue, an efficient geo-computational algorithm, called NTP-A*, is proposed. The proposed NTP-A* algorithm employs the A* and branch-and-bound techniques to discard inaccessible links during the two shortest path searches, and thereby improves STP construction performance. Comprehensive computational experiments are carried out to demonstrate the computational advantage of the proposed algorithm. Several implementation techniques, including the label-correcting technique and the hybrid link-node labeling technique, are discussed and analyzed. Experimental results show that the proposed NTP-A* algorithm can improve STP construction performance in large-scale road networks by a factor of 100 compared with existing algorithms.
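The straightforward baseline mentioned in this abstract, two one-to-all shortest path searches followed by a time-budget test on each link, can be sketched as below. This is an assumption-laden illustration and not NTP-A*: the function names, the tuple-based link format, and the simplifications (fixed link travel times, no activity duration at any location) are all hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """One-to-all shortest travel times from `source` over adjacency lists."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def space_time_prism_links(links, origin, destination, budget):
    """Return the directed links (u, v) usable within the time budget.

    A link is accessible if the earliest arrival at u from the origin,
    plus the link travel time, plus the shortest time from v to the
    destination does not exceed `budget`.
    """
    forward, backward = {}, {}
    for u, v, w in links:
        forward.setdefault(u, []).append((v, w))
        backward.setdefault(v, []).append((u, w))  # reversed graph
    from_origin = dijkstra(forward, origin)             # search 1
    to_destination = dijkstra(backward, destination)    # search 2
    inf = float("inf")
    return [(u, v) for u, v, w in links
            if from_origin.get(u, inf) + w + to_destination.get(v, inf) <= budget]


if __name__ == "__main__":
    # Hypothetical toy network: (from_node, to_node, travel_time).
    links = [("A", "B", 2), ("B", "D", 2), ("A", "C", 5),
             ("C", "D", 5), ("B", "C", 1)]
    print(space_time_prism_links(links, "A", "D", budget=6))
    # → [('A', 'B'), ('B', 'D')]
```

Both searches here label every reachable node even though only two of the five links end up in the prism, which is the overhead the abstract attributes to the straightforward approach.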
Predicting recurrent aphthous ulceration using genetic algorithms-optimized neural networks

Directory of Open Access Journals (Sweden)

Najla S Dar-Odeh

2010-05-01

Full Text Available Najla S Dar-Odeh1, Othman M Alsmadi2, Faris Bakri3, Zaer Abu-Hammour2, Asem A Shehabi3, Mahmoud K Al-Omiri1, Shatha M K Abu-Hammad4, Hamzeh Al-Mashni4, Mohammad B Saeed4, Wael Muqbil4, Osama A Abu-Hammad1; 1Faculty of Dentistry, 2Faculty of Engineering and Technology, 3Faculty of Medicine, University of Jordan, Amman, Jordan; 4Dental Department, University of Jordan Hospital, Amman, Jordan. Objective: To construct and optimize a neural network that is capable of predicting the occurrence of recurrent aphthous ulceration (RAU) based on a set of appropriate input data. Participants and methods: Artificial neural network (ANN) software employing genetic algorithms to optimize the architecture of the neural networks was used. Input and output data of 86 participants (predisposing factors and status of the participants with regard to recurrent aphthous ulceration) were used to construct and train the neural networks. The optimized neural networks were then tested using untrained data of a further 10 participants. Results: The optimized neural network that produced the most accurate predictions for the presence or absence of recurrent aphthous ulceration was found to employ: gender, hematological (with or without ferritin) and mycological data of the participants, frequency of tooth brushing, and consumption of vegetables and fruits. Conclusions: Factors appearing to be related to recurrent aphthous ulceration and appropriate for use as input data to construct ANNs that predict recurrent aphthous ulceration were found to include the following: gender, hemoglobin, serum vitamin B12, serum ferritin, red cell folate, salivary candidal colony count, frequency of tooth brushing, and the number of fruits or vegetables consumed daily. Keywords: artificial neural networks, recurrent, aphthous ulceration, ulcer
Enhanced backpropagation training algorithm for transient event identification

International Nuclear Information System (INIS)

Vitela, J.; Reifman, J.

1993-01-01

We present an enhanced backpropagation (BP) algorithm for training feedforward neural networks that avoids the undesirable premature saturation of the network output nodes and accelerates the training process even in cases where premature saturation is not present. When the standard BP algorithm is applied to train patterns of nuclear power plant (NPP) transients, the network output nodes often become prematurely saturated causing the already slow rate of convergence of the algorithm to become even slower. When premature saturation occurs, the gradient of the prediction error becomes very small, although the prediction error itself is still large, yielding negligible weight updates and hence no significant decrease in the prediction error until the eventual recovery of the output nodes from saturation. By defining the onset of premature saturation and systematically modifying the gradient of the prediction error at saturation, we developed an enhanced BP algorithm that is compared with the standard BP algorithm in training a network to identify NPP transients
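The premature saturation described here, a large prediction error paired with a vanishing gradient, can be reproduced with one output node. The sketch below assumes a sigmoid output unit and a squared-error loss, neither of which is stated in the abstract, and it shows only the standard BP gradient; the authors' modification of the gradient at saturation is not reproduced. The function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def output_delta(z, target):
    """Error and standard BP gradient at a sigmoid output node.

    Assumes the loss 0.5 * (y - target)**2, so the gradient with respect
    to the pre-activation z is (y - target) * y * (1 - y).
    """
    y = sigmoid(z)
    error = y - target
    return error, error * y * (1.0 - y)


if __name__ == "__main__":
    # Saturated on the wrong side: output near 1 while the target is 0.
    print(output_delta(z=8.0, target=0.0))
    # Unsaturated node with the same target, for comparison.
    print(output_delta(z=0.0, target=0.0))
```

In the saturated case the error is close to 1 but the factor y(1 - y) shrinks the gradient to a few ten-thousandths, so the weight updates are negligible, whereas the unsaturated node has a smaller error and a far larger gradient.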
Output Current Ripple Reduction Algorithms for Home Energy Storage Systems

Directory of Open Access Journals (Sweden)

Jin-Hyuk Park

2013-10-01

Full Text Available This paper proposes an output current ripple reduction algorithm using a proportional-integral (PI) controller for an energy storage system (ESS). In single-phase systems, the DC-link voltage of the DC/AC inverter pulsates, producing a second-order harmonic at twice the grid frequency. The output current of the DC/DC converter has a ripple component because of the ripple of the DC-link voltage. The second-order harmonic adversely affects the battery lifetime. The proposed algorithm has the advantage of reducing the second-order harmonic of the output current in a variable-frequency system. The proposed algorithm is verified by PSIM simulation and by experiment with a 3 kW ESS model.
Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms

Science.gov (United States)

Schädler, Marc R.; Warzybok, Anna; Kollmeier, Birger

2018-01-01

The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments in an objective way, independent from any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs and benefits in SRT with the German matrix sentence recognition test when using eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge about the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Possibly, additional individual factors should be considered for more accurate predictions with impaired hearing. A competing talker condition clearly showed one limitation of current ASR technology, as the empirical performance with SRTs lower than −20 dB could not be predicted. PMID:29692200
Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

Science.gov (United States)

Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

Encoding the amino acid sequences of proteins so that they can be classified into their respective families and subfamilies is an important research area. However, for a given protein, knowing its exact action, whether as a hormone, an enzyme, a transmembrane protein or a nuclear receptor, depends not solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict a protein's tertiary structure. Several methods are used to develop and evaluate the system in order to achieve better accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food foraging behavior of honey bees, is used in the searching phase. The experiment is conducted on short-sequence proteins that were used in previous research with well-known tools. The proposed approach shows promising results.
A combination of compositional index and genetic algorithm for predicting transmembrane helical segments.

Directory of Open Access Journals (Sweden)

Nazar Zaki

Full Text Available Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.
Topology control algorithm for wireless sensor networks based on Link forwarding

Science.gov (United States)

Pucuo, Cairen; Qi, Ai-qin

2018-03-01

Research on topology control can effectively save energy and extend the service life of wireless sensor networks. In this paper, an algorithm called LTHC (link transmit hybrid clustering), based on link transmission, is proposed. It reduces energy expenditure by changing the way cluster nodes communicate. The idea is to establish a link between the cluster and the SINK node when the cluster is formed, where the link node must be a non-cluster node. Through this link, the cluster sends information to the SINK node. To distribute energy consumption evenly across the network, prolong network survival time, and improve communication, this scheme reduces energy expenditure most for clusters that are far from the SINK node. Experiments show that LTHC is far superior to the traditional LEACH in both traffic and network survival time.
Prediction of Aerodynamic Coefficients for Wind Tunnel Data using a Genetic Algorithm Optimized Neural Network

Science.gov (United States)

Rajkumar, T.; Aragon, Cecilia; Bardina, Jorge; Britten, Roy

2002-01-01

A fast, reliable way of predicting aerodynamic coefficients is produced using a neural network optimized by a genetic algorithm. Basic aerodynamic coefficients (e.g. lift, drag, pitching moment) are modelled as functions of angle of attack and Mach number. The neural network is first trained on a relatively rich set of data from wind tunnel tests or numerical simulations to learn an overall model. Most of the aerodynamic parameters can be well fitted using polynomial functions. A new set of data, which can be relatively sparse, is then supplied to the network to produce a new model consistent with the previous model and the new data. Because the new model interpolates realistically between the sparse test data points, it is suitable for use in piloted simulations. The genetic algorithm is used to choose a neural network architecture that gives the best results, avoiding over- and under-fitting of the test data.
Fast Simulation of 3-D Surface Flanging and Prediction of the Flanging Lines Based On One-Step Inverse Forming Algorithm

International Nuclear Information System (INIS)

Bao Yidong; Hu Sibo; Lang Zhikui; Hu Ping

2005-01-01

A fast simulation scheme for 3D curved binder flanging and blank shape prediction of sheet metal, based on the one-step inverse finite element method, is proposed, in which the total plasticity theory and the proportional loading assumption are used. The scheme can be used to simulate 3D flanging with complex curved binder shapes, and is suitable for simulating any type of flanging model by numerically determining the flanging height and flanging lines. Compared with other methods such as the analytic algorithm and the blank sheet-cut return method, the prominent advantage of the present scheme is that it can directly predict the location of the 3D flanging lines when simulating the flanging process. The prediction time for flanging lines is therefore considerably reduced. Two typical 3D curved binder flanging cases with stretch and shrink characteristics are simulated using both the present scheme and an incremental FE non-inverse algorithm based on incremental plasticity theory; the comparison shows the validity and high efficiency of the present scheme.
Algorithms for Reinforcement Learning

CERN Document Server

Szepesvari, Csaba

2010-01-01

Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms'
The Top Ten Algorithms in Data Mining

CERN Document Server

Wu, Xindong

2009-01-01

From classification and clustering to statistical learning, association analysis, and link mining, this book covers the most important topics in data mining research. It presents the ten most influential algorithms used in the data mining community today. Each chapter provides a detailed description of the algorithm, a discussion of available software implementation, advanced topics, and exercises. With a simple data set, examples illustrate how each algorithm works and highlight the overall performance of each algorithm in a real-world application. Featuring contributions from leading researc

Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method

Science.gov (United States)

Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty

2017-03-01

Hydroponics is a method of farming without soil. One hydroponic plant is watercress (Nasturtium officinale). The growth and development of hydroponic watercress is influenced by nutrient level, acidity, and temperature. These independent variables can be used as system inputs to predict the level of plant growth and development. The prediction system uses the Mamdani fuzzy inference method. The system was built to implement a Fuzzy Inference System (FIS) as part of the Fuzzy Logic Toolbox (FLT) in MATLAB R2007b. An FIS is a computing system that works on the principle of fuzzy reasoning, which is similar to human reasoning. An FIS basically consists of four units: a fuzzification unit, a fuzzy logic reasoning unit, a knowledge base unit, and a defuzzification unit. In addition, the effect of the independent variables on plant growth and development can be visualized with the three-dimensional FIS output surface diagram, and assessed with statistical tests on the data from the prediction system using multiple linear regression, which includes multiple linear regression analysis, the t test, the F test, the coefficient of determination, and predictor contributions, calculated using the SPSS (Statistical Product and Service Solutions) software application.
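The four-unit FIS structure described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' MATLAB system: it uses a single input (nutrient level), and the membership ranges, the two rules, and the names `tri` and `mamdani_growth` are hypothetical choices made for the example.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani_growth(nutrient, steps=1000):
    """Single-input Mamdani inference; all ranges and rules are assumptions."""
    # 1. Fuzzification: degree to which the input (scale 0..10) is low or high.
    low = tri(nutrient, -1.0, 0.0, 6.0)
    high = tri(nutrient, 4.0, 10.0, 11.0)

    # 2-3. Knowledge base and reasoning (min implication, max aggregation):
    #   IF nutrient is low  THEN growth is slow
    #   IF nutrient is high THEN growth is fast
    def aggregated(y):
        slow = min(low, tri(y, -1.0, 0.0, 60.0))
        fast = min(high, tri(y, 40.0, 100.0, 101.0))
        return max(slow, fast)

    # 4. Defuzzification: centroid of the aggregated output set on 0..100.
    ys = [100.0 * i / steps for i in range(steps + 1)]
    weights = [aggregated(y) for y in ys]
    total = sum(weights)
    return sum(y * w for y, w in zip(ys, weights)) / total if total else 0.0
```

Calling `mamdani_growth(2.0)` and `mamdani_growth(8.0)` gives a lower and a higher crisp growth score respectively; a real system along the lines of the abstract would add acidity and temperature as further inputs with their own membership functions and rules.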
A Rule-Based Model for Bankruptcy Prediction Based on an Improved Genetic Ant Colony Algorithm

Directory of Open Access Journals (Sweden)

Yudong Zhang

2013-01-01

Full Text Available In this paper, we propose a hybrid system to predict corporate bankruptcy. The whole procedure consists of the following four stages: first, sequential forward selection was used to extract the most important features; second, a rule-based model was chosen to fit the given dataset since it can present physical meaning; third, a genetic ant colony algorithm (GACA) was introduced, and the fitness scaling strategy and the chaotic operator were incorporated into GACA, forming a new algorithm, fitness-scaling chaotic GACA (FSCGACA), which was used to seek the optimal parameters of the rule-based model; and finally, the stratified K-fold cross-validation technique was used to enhance the generalization of the model. Simulation experiments on data from 1000 corporations collected from 2006 to 2009 demonstrated that the proposed model was effective. It selected the 5 most important factors as “net income to stock broker’s equality,” “quick ratio,” “retained earnings to total assets,” “stockholders’ equity to total assets,” and “financial expenses to sales.” The total misclassification error of the proposed FSCGACA was only 7.9%, outperforming the genetic algorithm (GA), ant colony algorithm (ACA), and GACA. The average computation time of the model is 2.02 s.
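The stratified K-fold step mentioned in this abstract can be illustrated with a short Python sketch. It is an assumption-laden example rather than the authors' procedure: the function name `stratified_kfold`, the round-robin dealing strategy, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Return k folds (lists of sample indices) with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds so that
        # every fold keeps approximately the overall class ratio.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds
```

Each fold then serves once as the validation set while the remaining folds are used for fitting, so that a rare class (for example, bankrupt firms) is represented in every validation split.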
Improving protein function prediction methods with integrated literature data

Directory of Open Access Journals (Sweden)

Gabow Aaron P

2008-04-01

, but co-occurrence data still proves beneficial. Conclusion Co-occurrence data is a valuable supplemental source for graph-theoretic function prediction algorithms. A rapidly growing literature corpus ensures that co-occurrence data is a readily-available resource for nearly every studied organism, particularly those with small protein interaction databases. Though arguably biased toward known genes, co-occurrence data provides critical additional links to well-studied regions in the interaction network that graph-theoretic function prediction algorithms can exploit.
Generalized Predictive Control and Neural Generalized Predictive Control

Directory of Open Access Journals (Sweden)

Sadhana CHIDRAWAR

2008-12-01

Full Text Available As Model Predictive Control (MPC) relies on the predictive Control using a multilayer feed forward network as the plant's linear model is presented. Using Newton-Raphson as the optimization algorithm significantly reduces the number of iterations needed for convergence compared with other techniques. This paper presents a detailed derivation of Generalized Predictive Control and Neural Generalized Predictive Control with Newton-Raphson as the minimization algorithm. The performance of the approach has been tested on three separate systems. Simulation results show the effect of the neural network on Generalized Predictive Control. A performance comparison of these three system configurations is given in terms of ISE and IAE.
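ISE and IAE are commonly read as the integral of squared error and the integral of absolute error between the reference and the plant output. A minimal sketch of how such indices might be computed from sampled signals follows; the function name `ise_iae`, the rectangle-rule approximation, and the sampling interval `dt` are assumptions for illustration, not details taken from the paper.

```python
def ise_iae(reference, output, dt):
    """Approximate ISE and IAE from equally spaced samples (rectangle rule)."""
    errors = [r - y for r, y in zip(reference, output)]
    ise = sum(e * e for e in errors) * dt   # integral of squared error
    iae = sum(abs(e) for e in errors) * dt  # integral of absolute error
    return ise, iae
```

ISE penalizes large transient errors more heavily, while IAE weights all deviations linearly, which is why the two are often reported together when comparing controllers.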
Development and validation of a risk prediction algorithm for the recurrence of suicidal ideation among general population with low mood.

Science.gov (United States)

Liu, Y; Sareen, J; Bolton, J M; Wang, J L

2016-03-15

Suicidal ideation is one of the strongest predictors of recent and future suicide attempt. This study aimed to develop and validate a risk prediction algorithm for the recurrence of suicidal ideation among the population with low mood. A total of 3035 participants from the U.S. National Epidemiologic Survey on Alcohol and Related Conditions with suicidal ideation at their lowest mood at baseline were included. The Alcohol Use Disorder and Associated Disabilities Interview Schedule, based on the DSM-IV criteria, was used. Logistic regression modeling was conducted to derive the algorithm. Discrimination and calibration were assessed in the development and validation cohorts. In the development data, the proportion of recurrent suicidal ideation over 3 years was 19.5 (95% CI: 17.7, 21.5). The developed algorithm consisted of 6 predictors: age, feelings of emptiness, sudden mood changes, self-harm history, depressed mood in the past 4 weeks, and interference with social activities in the past 4 weeks because of physical health or emotional problems; emptiness was the most important risk factor. The model had good discriminative power (C statistic=0.8273, 95% CI: 0.8027, 0.8520). The C statistic was 0.8091 (95% CI: 0.7786, 0.8395) in the external validation dataset and was 0.8193 (95% CI: 0.8001, 0.8385) in the combined dataset. This study does not apply to people with suicidal ideation who are not depressed. The developed risk algorithm for predicting the recurrence of suicidal ideation has good discrimination and excellent calibration. Clinicians can use this algorithm to stratify the risk of recurrence in patients and thus improve personalized treatment approaches, provide advice, and arrange further intensive monitoring. Copyright © 2016 Elsevier B.V. All rights reserved.
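The C statistic reported in this abstract measures discrimination: the probability that a randomly chosen participant with the outcome receives a higher predicted risk than one without it. A minimal pairwise sketch is given below; the function name `c_statistic` and the toy inputs are illustrative assumptions, and the quadratic loop is meant for clarity rather than for large cohorts.

```python
def c_statistic(scores, outcomes):
    """Fraction of (event, non-event) pairs ranked correctly; ties count one half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values around 0.81 to 0.83 indicate that roughly four out of five such pairs are ordered correctly.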
SPECIAL LIBRARIES OF FRAGMENTS OF ALGORITHMIC NETWORKS TO AUTOMATE THE DEVELOPMENT OF ALGORITHMIC MODELS

Directory of Open Access Journals (Sweden)

V. E. Marley

2015-01-01

Full Text Available Summary. The concept of algorithmic models arose from the algorithmic approach, in which the simulated object or phenomenon is represented as a process subject to the strict rules of an algorithm that describes how the object operates. An algorithmic model is a formalized description of a subject specialist's scenario for the simulated process, whose structure is comparable to the structure of the causal and temporal relationships between events of the process being modeled, together with all information necessary for its software implementation. Algorithmic networks are used to represent the structure of algorithmic models. They are normally defined as a loaded finite directed graph whose vertices are mapped to operators and whose arcs are variables bound by the operators. The language of algorithmic networks is highly expressive: the class of algorithms it can represent coincides with the class of all arbitrary algorithms. Existing modeling automation systems based on algorithmic networks mainly use operators that work with real numbers. Although this reduces their capabilities, it is sufficient for modeling a wide class of problems related to the economy, the environment, transport, and technical processes. The task of modeling the execution of schedules and network diagrams is relevant and useful. There are many systems for computing network diagrams; however, their monitoring is based on analysis of gaps and deadlines in the diagrams, with no predictive analysis of schedule execution. The library is designed for building such predictive models. By specifying source data, one obtains a set of projections, from which one is chosen and adopted as the new plan.
Prediction of customer behaviour analysis using classification algorithms

Science.gov (United States)

Raju, Siva Subramanian; Dhandayudam, Prabha

2018-04-01

Customer relationship management plays a crucial role in analyzing customer behavior patterns and customers' value to an enterprise. Customer data can be analyzed efficiently using various data mining techniques, with the goal of developing business strategies and enhancing the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated experimentally. The performance of the three classifiers is compared using three measures (accuracy, sensitivity, specificity), and the experimental results show that the J48 algorithm has better accuracy than the NB and MLPNN algorithms.
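The three measures used to compare the classifiers in this abstract can be computed from a confusion matrix. The following sketch shows one way to do so for a binary problem; the function name `classification_metrics` and the toy labels are hypothetical, and no results from the paper are reproduced.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (empty class) are reported as NaN rather than raising.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, since a classifier can reach high accuracy while missing most of the minority class.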
Safety, Efficacy, Predictability and Stability Indices of Photorefractive Keratectomy for Correction of Myopic Astigmatism with Plano-Scan and Tissue-Saving Algorithms

Directory of Open Access Journals (Sweden)

Mehrdad Mohammadpour

2013-10-01

Full Text Available Purpose: To assess the safety, efficacy and predictability of photorefractive keratectomy (PRK) [Tissue-saving (TS) versus Plano-scan (PS) ablation algorithms] of the Technolas 217z excimer laser for correction of myopic astigmatism. Methods: In this retrospective study, one hundred and seventy eyes of 85 patients (107 eyes (62.9%) with the PS and 63 eyes (37.1%) with the TS algorithm) were included. The TS algorithm was applied for those with central corneal thickness less than 500 µm or estimated residual stromal thickness less than 420 µm. Mitomycin C (MMC) was applied for 120 eyes (70.6%), in cases of an ablation depth of more than 60 µm and/or astigmatic correction of more than one diopter (D). Mean sphere, cylinder, spherical equivalent (SE) refraction, uncorrected visual acuity (UCVA), and best corrected visual acuity (BCVA) were measured preoperatively, and 4 weeks, 12 weeks and 24 weeks postoperatively. Results: One, three and six months postoperatively, 60%, 92.9%, 97.5% of eyes had UCVA of 20/20 or better, respectively. Mean preoperative and 1, 3, 6 months postoperative SE were -3.48±1.28 D (-1.00 to -8.75), -0.08±0.62 D, -0.02±0.57 and -0.004±0.29, respectively. In addition, 87.6%, 94.1% and 100% were within ±1.0 D of emmetropia and 68.2, 75.3, 95% were within ±0.5 of emmetropia. The safety and efficacy indices were 0.99 and 0.99 at 12 weeks and 1.009 and 0.99 at 24 weeks, respectively. There was no clinically or statistically significant difference between the outcomes of PS or TS algorithms or between those with or without MMC in either group in terms of safety, efficacy, predictability or stability. Dividing the eyes with subjective SE≤4 D and SE≥4 D postoperatively, there was no significant difference between the predictability of the two groups. There was no intra- or postoperative complication. Conclusion: Outcomes of PRK for correction of myopic astigmatism showed great promise with both PS and TS algorithms.
On Efficient Link Recommendation in Social Networks Using Actor-Fact Matrices

Directory of Open Access Journals (Sweden)

Michał Ciesielczyk

2015-01-01

Full Text Available Link recommendation is a popular research subject in the field of social network analysis and mining. Often, the main emphasis is put on the development of new recommendation algorithms, semantic enhancements to existing solutions, design of new similarity measures, and so forth. However, relatively little scientific attention has been paid to the impact that various data representation models have on the performance of recommendation algorithms. By performance we mean not the time or memory efficiency of algorithms, but the precision and recall of recommender systems. Our recent findings consistently show that the choice of network representation model has an important and measurable impact on the quality of recommendations. In this paper we argue that the computation quality of link recommendation algorithms depends significantly on the social network representation and we advocate the use of the actor-fact matrix as the best alternative. We verify our findings using several state-of-the-art link recommendation algorithms, such as SVD, RSVD, and RRI, on both single-relation and multirelation datasets.
Decoding the Brain’s Algorithm for Categorization from its Neural Implementation

Science.gov (United States)

Mack, Michael L.; Preston, Alison R.; Love, Bradley C.

2013-01-01

Summary Acts of cognition can be described at different levels of analysis: what behavior should characterize the act, what algorithms and representations underlie the behavior, and how the algorithms are physically realized in neural activity [1]. Theories that bridge levels of analysis offer more complete explanations by leveraging the constraints present at each level [2–4]. Despite the great potential for theoretical advances, few studies of cognition bridge levels of analysis. For example, formal cognitive models of category decisions accurately predict human decision making [5, 6], but whether model algorithms and representations supporting category decisions are consistent with underlying neural implementation remains unknown. This uncertainty is largely due to the hurdle of forging links between theory and brain [7–9]. Here, we tackle this critical problem by using brain response to characterize the nature of mental computations that support category decisions to evaluate two dominant, and opposing, models of categorization. We found that brain states during category decisions were significantly more consistent with latent model representations from exemplar [5] rather than prototype theory [10, 11]. Representations of individual experiences, not the abstraction of experiences, are critical for category decision making. Holding models accountable for behavior and neural implementation provides a means for advancing more complete descriptions of the algorithms of cognition. PMID:24094852
Detection of the dominant direction of information flow and feedback links in densely interconnected regulatory networks

Directory of Open Access Journals (Sweden)

Ispolatov Iaroslav

2008-10-01

Full Text Available Abstract Background: Finding the dominant direction of flow of information in densely interconnected regulatory or signaling networks is required in many applications in computational biology and neuroscience. This is achieved by first identifying and removing links which close up feedback loops in the original network and hierarchically arranging nodes in the remaining network. In mathematical language this corresponds to a problem of making a graph acyclic by removing as few links as possible and thus altering the original graph in the least possible way. The exact solution of this problem requires enumeration of all cycles and combinations of removed links, which, as an NP-hard problem, is computationally prohibitive even for modest-size networks. Results: We introduce and compare two approximate numerical algorithms for solving this problem: a probabilistic one based on simulated annealing of the hierarchical layout of the network, which minimizes the number of "backward" links going from lower to higher hierarchical levels, and a deterministic, "greedy" algorithm that sequentially cuts the links that participate in the largest number of feedback cycles. We find that the annealing algorithm outperforms the deterministic one in terms of speed, memory requirement, and the actual number of removed links. To further improve the visual perception of the layout produced by the annealing algorithm, we perform an additional minimization of the length of hierarchical links while keeping the number of anti-hierarchical links at its minimum. The annealing algorithm is then tested on several examples of regulatory and signaling networks/pathways operating in human cells. Conclusion: The proposed annealing algorithm is powerful enough to produce often optimal layouts of protein networks in whole organisms, consisting of ~10^4 nodes and ~10^5 links, while the applicability of the greedy algorithm is limited to individual pathways with ~100
Homophyly/kinship hypothesis: Natural communities, and predicting in networks

Science.gov (United States)

Li, Angsheng; Li, Jiankou; Pan, Yicheng

2015-02-01

It has been a longstanding challenge to understand natural communities in real world networks. We proposed a community finding algorithm based on fitness of networks, two algorithms for prediction, accurate prediction and confirmation of keywords for papers in the citation network Arxiv HEP-TH (high energy physics theory), and the measures of internal centrality, external de-centrality, internal and external slopes to characterize the structures of communities. We implemented our algorithms on 2 citation and 5 cooperation graphs. Our experiments explored and validated a homophyly/kinship principle of real world networks. The homophyly/kinship principle includes: (1) homophyly is the natural selection in real world networks, similar to Darwin's kinship selection in nature, (2) real world networks consist of natural communities generated by the natural selection of homophyly, (3) most individuals in a natural community share a short list of common attributes, (4) natural communities have an internal centrality (or internal heterogeneity) that a natural community has a few nodes dominating most of the individuals in the community, (5) natural communities have an external de-centrality (or external homogeneity) that external links of a natural community homogeneously distributed in different communities, and (6) natural communities of a given network have typical structures determined by the internal slopes, and have typical patterns of outgoing links determined by external slopes, etc. Our homophyly/kinship principle perfectly matches Darwin's observation that animals from ants to people form social groups in which most individuals work for the common good, and that kinship could encourage altruistic behavior. Our homophyly/kinship principle is the network version of Darwinian theory, and builds a bridge between Darwinian evolution and network science.
Propensity scores-potential outcomes framework to incorporate severity probabilities in the highway safety manual crash prediction algorithm.

Science.gov (United States)

Sasidharan, Lekshmi; Donnell, Eric T

2014-10-01

Accurate estimation of the expected number of crashes at different severity levels for entities with and without countermeasures plays a vital role in selecting countermeasures in the framework of the safety management process. The current practice is to use the American Association of State Highway and Transportation Officials' Highway Safety Manual crash prediction algorithms, which combine safety performance functions and crash modification factors, to estimate the effects of safety countermeasures on different highway and street facility types. Many of these crash prediction algorithms are based solely on crash frequency, or assume that severity outcomes are unchanged when planning for, or implementing, safety countermeasures. Failing to account for the uncertainty associated with crash severity outcomes, and assuming crash severity distributions remain unchanged in safety performance evaluations, limits the utility of the Highway Safety Manual crash prediction algorithms in assessing the effect of safety countermeasures on crash severity. This study demonstrates the application of a propensity scores-potential outcomes framework to estimate the probability distribution for the occurrence of different crash severity levels by accounting for the uncertainties associated with them. The probability of fatal and severe injury crash occurrence at lighted and unlighted intersections is estimated in this paper using data from Minnesota. The results show that the expected probability of occurrence of fatal and severe injury crashes at a lighted intersection was 1 in 35 crashes, and the estimated risk ratio indicates that the corresponding probability at an unlighted intersection was 1.14 times higher than at lighted intersections. The results from the potential outcomes-propensity scores framework are compared to results obtained from traditional binary logit models, without application of propensity scores matching. Traditional binary logit analysis suggests that
CN earthquake prediction algorithm and the monitoring of the future strong Vrancea events

International Nuclear Information System (INIS)

Moldoveanu, C.L.; Radulian, M.; Novikova, O.V.; Panza, G.F.

2002-01-01

The strong earthquakes originating at intermediate depth in the Vrancea region (located in the SE corner of the highly bent Carpathian arc) represent one of the most important natural disasters able to induce heavy effects (high toll of casualties and extensive damage) in the Romanian territory. The occurrence of these earthquakes is irregular, but not infrequent. Their effects are felt over a large territory, from Central Europe to Moscow and from Greece to Scandinavia. The largest cultural and economic center exposed to the seismic risk due to the Vrancea earthquakes is Bucharest. This metropolitan area (230 km² wide) is characterized by the presence of 2.5 million inhabitants (10% of the country's population) and by a considerable number of high-risk structures and infrastructures. The best way to face strong earthquakes is to mitigate the seismic risk by using the two possible complementary approaches represented by (a) the antiseismic design of structures and infrastructures (able to support strong earthquakes without significant damage), and (b) strong earthquake prediction (in terms of alarm intervals declared for long, intermediate or short-term space- and time-windows). Intermediate-term medium-range earthquake prediction represents the most realistic target to be reached at the present state of knowledge. The alarm declared in this case extends over a time window of about one year or more, and a space window of a few hundred kilometers. In the case of Vrancea events the spatial uncertainty is much smaller, about 100 km. The main measures for the mitigation of the seismic risk allowed by the intermediate-term medium-range prediction are: (a) verification of the stability of buildings and infrastructures, and reinforcement measures when required, (b) elaboration of emergency plans of action, (c) scheduling of the main actions required in order to restore the normality of social and economic life after the earthquake. The paper presents the
Frontal responses during learning predict vulnerability to the psychotogenic effects of ketamine : Linking cognition, brain activity, and psychosis

NARCIS (Netherlands)

Corlett, Philip R.; Honey, Garry D.; Aitken, Michael R. F.; Dickinson, Anthony; Shanks, David R.; Absalom, Anthony R.; Lee, Michael; Pomarol-Clotet, Edith; Murray, Graham K.; McKenna, Peter J.; Robbins, Trevor W.; Bullmore, Edward T.; Fletcher, Paul C.

Context: Establishing a neurobiological account of delusion formation that links cognitive processes, brain activity, and symptoms is important to furthering our understanding of psychosis. Objective: To explore a theoretical model of delusion formation that implicates prediction error - dependent
Chronic obstructive pulmonary disease and coronary disease: COPDCoRi, a simple and effective algorithm for predicting the risk of coronary artery disease in COPD patients.

Science.gov (United States)

Cazzola, Mario; Calzetta, Luigino; Matera, Maria Gabriella; Muscoli, Saverio; Rogliani, Paola; Romeo, Francesco

2015-08-01

Chronic obstructive pulmonary disease (COPD) is often associated with coronary artery disease (CAD), representing a potential and independent risk factor for cardiovascular morbidity. Therefore, the aim of this study was to identify an algorithm for predicting the risk of CAD in COPD patients. We analyzed data of patients referred to the Cardiology ward and the Respiratory Diseases outpatient clinic of Tor Vergata University (2010-2012, 1596 records). The study population was clustered as training population (COPD patients undergoing coronary arteriography), control population (non-COPD patients undergoing coronary arteriography), and test population (COPD patients whose records reported information on the coronary status). The predictive model was built via causal relationship between variables, stepwise binary logistic regression and Hosmer-Lemeshow analysis. The algorithm was validated via the split-sample validation method and receiver operating characteristic (ROC) curve analysis. The diagnostic accuracy was assessed. In training population the variables gender (men/women OR: 1.7, 95%CI: 1.237-2.5, P COPD patients, whereas in control population also age and diabetes were correlated. The stepwise binary logistic regressions made it possible to build a well-fitting predictive model for the training population but not for the control population. The predictive algorithm showed a diagnostic accuracy of 81.5% (95%CI: 77.78-84.71) and an AUC of 0.81 (95%CI: 0.78-0.85) for the validation set. The proposed algorithm is effective for predicting the risk of CAD in COPD patients via a rapid, inexpensive and non-invasive approach. Copyright © 2015 Elsevier Ltd. All rights reserved.
XTALOPT version r11: An open-source evolutionary algorithm for crystal structure prediction

Science.gov (United States)

Avery, Patrick; Falls, Zackary; Zurek, Eva

2018-01-01

Version 11 of XTALOPT, an evolutionary algorithm for crystal structure prediction, has now been made available for download from the CPC library or the XTALOPT website, http://xtalopt.github.io. Whereas the previous versions of XTALOPT were published under the GNU General Public License (GPL), the current version is made available under the 3-Clause BSD License, which is an open source license that is recognized by the Open Source Initiative. Importantly, the new version can be executed via a command line interface (i.e., it does not require the use of a Graphical User Interface). Moreover, the new version is written as a stand-alone program, rather than an extension to AVOGADRO.
Hybrid robust model based on an improved functional link neural network integrating with partial least square (IFLNN-PLS) and its application to predicting key process variables.

Science.gov (United States)

He, Yan-Lin; Xu, Yuan; Geng, Zhi-Qiang; Zhu, Qun-Xiong

2016-03-01

In this paper, a hybrid robust model based on an improved functional link neural network integrating with partial least square (IFLNN-PLS) is proposed. Firstly, an improved functional link neural network with small norm of expanded weights and high input-output correlation (SNEWHIOC-FLNN) was proposed for enhancing the generalization performance of FLNN. Unlike the traditional FLNN, the expanded variables of the original inputs are not directly used as the inputs in the proposed SNEWHIOC-FLNN model. The original inputs are attached to some small norm of expanded weights. As a result, the correlation coefficient between some of the expanded variables and the outputs is enhanced. The larger the correlation coefficient is, the more relevant the expanded variables tend to be. In the end, the expanded variables with larger correlation coefficients are selected as the inputs to improve the performance of the traditional FLNN. In order to test the proposed SNEWHIOC-FLNN model, three UCI (University of California, Irvine) regression datasets named Housing, Concrete Compressive Strength (CCS), and Yacht Hydro Dynamics (YHD) are selected. Then a hybrid model based on the improved FLNN integrating with partial least square (IFLNN-PLS) was built. In the IFLNN-PLS model, the connection weights are calculated using the partial least square method rather than the error back propagation algorithm. Lastly, IFLNN-PLS was developed as an intelligent measurement model for accurately predicting the key variables in the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. Simulation results illustrated that the IFLNN-PLS could significantly improve the prediction performance. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
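The selection step described in this abstract, keeping the expanded variables whose correlation with the output is largest, can be sketched as follows. This is only an illustration of correlation-based selection: the expansion functions in `expand`, the names `pearson` and `select_expanded`, and the single scalar input are assumptions, and the small-norm expanded weights and the PLS fitting of the actual model are not reproduced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant sequence."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def expand(x):
    """Hypothetical functional expansion of one scalar input."""
    return {
        "x": x,
        "sin(pi x)": math.sin(math.pi * x),
        "cos(pi x)": math.cos(math.pi * x),
        "x^2": x * x,
    }


def select_expanded(inputs, outputs, top_k):
    """Rank expanded variables by |correlation| with the output and keep top_k."""
    columns = {}
    for x in inputs:
        for name, value in expand(x).items():
            columns.setdefault(name, []).append(value)
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], outputs)),
        reverse=True,
    )
    return ranked[:top_k]
```

With a quadratic target, for instance, the `x^2` term ranks first and the odd terms rank last, which is the kind of pruning the abstract attributes to the high input-output correlation criterion.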
Sensitivity and Uncertainty Analysis for Streamflow Prediction Using Different Objective Functions and Optimization Algorithms: San Joaquin California

Science.gov (United States)

Paul, M.; Negahban-Azar, M.

2017-12-01

Hydrologic models usually need to be calibrated against observed streamflow at the outlet of a particular drainage area. However, a large number of parameters must be fitted in the model because field measurements are unavailable. It is therefore difficult to calibrate the model for a large number of potentially uncertain model parameters. This becomes even more challenging if the model covers a large watershed with multiple land uses and various geophysical characteristics. Sensitivity analysis (SA) can be used as a tool to identify the most sensitive model parameters, which affect the calibrated model performance. There are many different calibration and uncertainty analysis algorithms, which can be run with different objective functions. By incorporating sensitive parameters in streamflow simulation, the effect of a suitable algorithm in improving model performance can be demonstrated with Soil and Water Assessment Tool (SWAT) modeling. In this study, SWAT was applied to the San Joaquin Watershed in California, covering 19,704 km², to calibrate daily streamflow. Recently, severe water stress has been escalating in this watershed due to intensified climate variability, prolonged drought and depletion of groundwater for agricultural irrigation. It is therefore important to perform a proper uncertainty analysis, given the uncertainties inherent in hydrologic modeling, to predict the spatial and temporal variation of the hydrologic process and to evaluate the impacts of different hydrologic variables. The purpose of this study was to evaluate the sensitivity and uncertainty of the calibrated parameters for predicting streamflow. To evaluate the sensitivity of the calibrated parameters, three different optimization algorithms (Sequential Uncertainty Fitting, SUFI-2; Generalized Likelihood Uncertainty Estimation, GLUE; and Parameter Solution, ParaSol) were used with four different objective functions (coefficient of determination
Design of an Interface for Page Rank Calculation using Web Link Attributes Information

Directory of Open Access Journals (Sweden)

Jeyalatha SIVARAMAKRISHNAN

2010-01-01

Full Text Available This paper deals with Web Structure Mining and different structure mining algorithms such as PageRank, HITS, Trust Rank and Sel-HITS. The functioning of these algorithms is discussed. An incremental algorithm for calculating PageRank using an interface has been formulated. This algorithm uses Web Link Attributes Information as key parameters and has been implemented using the Visibility and Position of a Link. The application of the Web Structure Mining Algorithm in an Academic Search Application is discussed. The present work can be a useful input to Web Users, Faculty, Students and Web Administrators in a University Environment.
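For readers unfamiliar with the baseline that this record builds on, the following is a minimal sketch of standard PageRank by power iteration. It is not the incremental, link-attribute-based algorithm described in the abstract, whose details are not given here. The function name `pagerank`, the damping factor of 0.85, and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank (ranks sum to 1).
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                # Each page shares its rank equally among its out-links.
                new_rank[q] += damping * rank[p] / len(outs)
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: D has no incoming links, C has the most.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
```

In this toy graph, page C collects links from every other page and ends up with the highest rank, while D, which nothing links to, keeps only the baseline share.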

GPS 2.1: enhanced prediction of kinase-specific phosphorylation sites with an algorithm of motif length selection.

Science.gov (United States)

Xue, Yu; Liu, Zexian; Cao, Jun; Ma, Qian; Gao, Xinjiao; Wang, Qingqi; Jin, Changjiang; Zhou, Yanhong; Wen, Longping; Ren, Jian

2011-03-01

As the most important post-translational modification of proteins, phosphorylation plays essential roles in all aspects of biological processes. Besides experimental approaches, computational prediction of phosphorylated proteins and their kinase-specific phosphorylation sites has emerged as a popular strategy because of its low cost, speed and convenience. In this work, we developed a kinase-specific phosphorylation site predictor, GPS 2.1 (Group-based Prediction System), with a novel but simple approach of motif length selection (MLS). With this approach, the robustness of the prediction system was greatly improved. All algorithms from older GPS versions were retained and integrated in GPS 2.1. The online service and local packages of GPS 2.1 were implemented in JAVA 1.5 (J2SE 5.0) and are freely available for academic research at: http://gps.biocuckoo.org.
MHC Class II epitope predictive algorithms

DEFF Research Database (Denmark)

Nielsen, Morten; Lund, Ole; Buus, S

2010-01-01

Major histocompatibility complex class II (MHC-II) molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. To be able to predict the immune response to given pathogens, a number of methods have been...... developed to predict peptide-MHC binding. However, few methods other than the pioneering TEPITOPE/ProPred method have been developed for MHC-II. Despite recent progress in method development, the predictive performance for MHC-II remains significantly lower than what can be obtained for MHC-I. One reason...
LinkMind: link optimization in swarming mobile sensor networks.

Science.gov (United States)

Ngo, Trung Dung

2011-01-01

A swarming mobile sensor network is comprised of a swarm of wirelessly connected mobile robots equipped with various sensors. Such a network can be applied in an uncertain environment for services such as cooperative navigation and exploration, object identification and information gathering. One of the most advantageous properties of the swarming wireless sensor network is that mobile nodes can work cooperatively to organize an ad-hoc network and optimize the network link capacity to maximize the transmission of gathered data from a source to a target. This paper describes a new method of link optimization of swarming mobile sensor networks. The new method is based on combination of the artificial potential force guaranteeing connectivities of the mobile sensor nodes and the max-flow min-cut theorem of graph theory ensuring optimization of the network link capacity. The developed algorithm is demonstrated and evaluated in simulation.
LinkMind: Link Optimization in Swarming Mobile Sensor Networks

Directory of Open Access Journals (Sweden)

Trung Dung Ngo

2011-08-01

Full Text Available A swarming mobile sensor network is comprised of a swarm of wirelessly connected mobile robots equipped with various sensors. Such a network can be applied in an uncertain environment for services such as cooperative navigation and exploration, object identification and information gathering. One of the most advantageous properties of the swarming wireless sensor network is that mobile nodes can work cooperatively to organize an ad-hoc network and optimize the network link capacity to maximize the transmission of gathered data from a source to a target. This paper describes a new method of link optimization of swarming mobile sensor networks. The new method is based on combination of the artificial potential force guaranteeing connectivities of the mobile sensor nodes and the max-flow min-cut theorem of graph theory ensuring optimization of the network link capacity. The developed algorithm is demonstrated and evaluated in simulation.
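The max-flow min-cut theorem mentioned in the two LinkMind records states that the largest flow a network can carry from source to target equals the capacity of its smallest cut. The sketch below computes that value with the Edmonds-Karp algorithm. It illustrates only the graph-theoretic part, not the artificial potential force or the authors' optimization method. The function name `max_flow`, the node labels, and the link capacities are hypothetical.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow.

    capacity: dict of dicts, capacity[u][v] = capacity of the link u -> v.
    Returns the maximum flow value from source to sink.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    # Build the residual graph, adding reverse edges with zero capacity.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: flow is maximal
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical network: source "s", two relay robots "a" and "b", target "t".
links = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
total = max_flow(links, "s", "t")
```

Here the two links leaving "s" form a cut of capacity 5, and the flow found also equals 5, which is the equality the theorem guarantees.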
A Fine-Grained API Link Prediction Approach Supporting CMDA Mashup Recommendation

Science.gov (United States)

Zhang, J.; Bao, Q.; Lee, T. J.; Ramachandran, R.; Lee, S.; Pan, L.; Gatlin, P. N.; Maskey, M.

2017-12-01

Service (API) discovery and recommendation is key to the spread of service-oriented architecture and service-oriented software engineering. Service recommendation typically relies on service linkage prediction calculated from the semantic distances (or similarities) among services based on their collection of inherent attributes. Given a specific context (mashup goal), however, different attributes may contribute differently to a service linkage. In this work, instead of training a model for all attributes as a whole, a novel approach is presented to simultaneously train separate models for individual attributes. Our contributions are threefold. First, we have developed an attribute-level data model featuring scalability and extensibility. We have extended the Multiplicative Attribute Graph (MAG) model to represent node profiles featuring rich categorical attributes, while relaxing its constraint of requiring a priori knowledge of predefined attributes. LDA is leveraged to dynamically identify attributes based on attribute modeling, and multiple Gaussian fit is applied to find globally optimal values. Second, we have seamlessly integrated the latent relationships between API attributes with the observed network structure based on historical API usage data. Such a layered information model enables us to predict the probability of a link between two APIs based on their attribute link affinities, which carry a variety of information including metadata, semantic data, historical usage data, as well as crowdsourced user comments and annotations. Third, we have developed a fine-grained context-aware mashup-API recommendation technique. On top of individual models trained for separate attributes, a dedicated layer is trained to represent the latent attribute distribution regarding mashup purpose, i.e., sensitivity of attributes to context. Thus, given the description of an intended mashup, the
PREDICTION BASED CHANNEL-HOPPING ALGORITHM FOR RENDEZVOUS IN COGNITIVE RADIO NETWORKS

Directory of Open Access Journals (Sweden)

Dhananjay Kumar

2012-12-01

Full Text Available Most existing work on rendezvous in cognitive radio networks deals only with two-user scenarios involving two secondary users and variable primary users, and aims at reducing the time-to-rendezvous. A common control channel for the establishment of communication is not considered, and hence the work falls under the category of ‘Blind Rendezvous’. Our work deals with the multi-user scenario and provides a methodology for the users to find each other in the very first time slot spent on rendezvous, otherwise called first-attempt rendezvous. The secondary users use the history of past communications to predict the frequency channel on which the user expects the rendezvous user to be. Our approach prevents greedy decision making between the users involved by using a cut-off time period for attempting rendezvous. Simulation results show that the time-to-rendezvous (TTR) is greatly reduced in comparison with other popular rendezvous algorithms.
Improved feature selection based on genetic algorithms for real time disruption prediction on JET

International Nuclear Information System (INIS)

Rattá, G.A.; Vega, J.; Murari, A.

2012-01-01

Highlights: ► A new signal selection methodology to improve disruption prediction is reported. ► The approach is based on Genetic Algorithms. ► An advanced predictor has been created with the new set of signals. ► The new system obtains considerably higher prediction rates. - Abstract: The early prediction of disruptions is an important aspect of research in the field of Tokamak control. A very recent predictor, called “Advanced Predictor Of Disruptions” (APODIS), developed for the “Joint European Torus” (JET), implements the real-time recognition of incoming disruptions with the best success rate ever achieved and outstanding stability for long periods following training. In this article, a new methodology for selecting the set of signal parameters in order to maximize the performance of the predictor is reported. The approach is based on “Genetic Algorithms” (GAs). With the feature selection derived from GAs, a new version of APODIS has been developed. The results are significantly better than those of the previous version, not only in terms of success rates but also in extending the interval before the disruption in which reliable predictions are achieved. Correct disruption predictions with a success rate in excess of 90% have been achieved 200 ms before the time of the disruption. The predictor response is compared with that of JET's Protection System (JPS), and the APODIS predictor is shown to be far superior. Both systems have been carefully tested with a large number of discharges to understand their relative merits and the most profitable directions for further improvement.
An improved algorithm for finding all minimal paths in a network

International Nuclear Information System (INIS)

Bai, Guanghan; Tian, Zhigang; Zuo, Ming J.

2016-01-01

Minimal paths (MPs) play an important role in network reliability evaluation. In this paper, we report an efficient recursive algorithm for finding all MPs in two-terminal networks, which consist of a source node and a sink node. A linked path structure indexed by nodes is introduced, which accepts both directed and undirected forms of networks. The distance between each node and the sink node is defined, and a simple recursive algorithm is presented for labeling the distance for each node. Based on the distance between each node and the sink node, additional conditions for backtracking are incorporated to reduce the number of search branches. With the newly introduced linked node structure, the distances between each node and the sink node, and the additional backtracking conditions, an improved backtracking algorithm for searching for all MPs is developed. In addition, the proposed algorithm can be adapted to search for all minimal paths for each source–sink pair in networks consisting of multiple source nodes and/or multiple sink nodes. Through computational experiments, it is demonstrated that the proposed algorithm is more efficient than existing algorithms when the network size is not too small. The proposed algorithm becomes more advantageous as the size of the network grows. - Highlights: • A linked path structure indexed by nodes is introduced to represent networks. • Additional conditions for backtracking are proposed based on the distance of each node. • An efficient algorithm is developed to find all MPs for two-terminal networks. • The computational efficiency of the algorithm for two-terminal networks is investigated. • The computational efficiency of the algorithm for multi-terminal networks is investigated.
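The general idea of distance labeling plus backtracking can be sketched as follows. This is a simplified stand-in, not the authors' algorithm: the abstract does not specify their linked path structure or their additional backtracking conditions, so the only pruning rule used here is to skip nodes from which the sink cannot be reached. The function names and the example network are assumptions.

```python
from collections import deque


def distances_to_sink(graph, sink):
    """Label every node with its hop distance to the sink (BFS on reversed edges).

    Nodes that cannot reach the sink receive no label.
    """
    reverse = {}
    for u, nbrs in graph.items():
        for v in nbrs:
            reverse.setdefault(v, []).append(u)
    dist = {sink: 0}
    queue = deque([sink])
    while queue:
        v = queue.popleft()
        for u in reverse.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def minimal_paths(graph, source, sink):
    """Enumerate all loop-free source-to-sink paths by backtracking."""
    dist = distances_to_sink(graph, sink)
    paths = []

    def extend(path, visited):
        node = path[-1]
        if node == sink:
            paths.append(list(path))
            return
        for nxt in graph.get(node, []):
            # Skip nodes already on the path or with no route to the sink.
            if nxt in visited or nxt not in dist:
                continue
            visited.add(nxt)
            path.append(nxt)
            extend(path, visited)
            path.pop()
            visited.remove(nxt)

    if source in dist:
        extend([source], {source})
    return paths


# Hypothetical directed bridge network; "x" is a dead end that gets pruned.
bridge = {
    "s": ["a", "b"],
    "a": ["b", "t", "x"],
    "b": ["a", "t"],
    "x": [],
}
all_paths = minimal_paths(bridge, "s", "t")
```

For this bridge network the search returns the four paths s-a-t, s-a-b-t, s-b-t and s-b-a-t, and never descends into the dead-end node "x". An undirected network can be passed by listing each edge in both directions.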
Cardiovascular Disease Population Risk Tool (CVDPoRT): predictive algorithm for assessing CVD risk in the community setting. A study protocol.

Science.gov (United States)

Taljaard, Monica; Tuna, Meltem; Bennett, Carol; Perez, Richard; Rosella, Laura; Tu, Jack V; Sanmartin, Claudia; Hennessy, Deirdre; Tanuseputro, Peter; Lebenbaum, Michael; Manuel, Douglas G

2014-10-23

Recent publications have called for substantial improvements in the design, conduct, analysis and reporting of prediction models. Publication of study protocols, with prespecification of key aspects of the analysis plan, can help to improve transparency, increase quality and protect against increased type I error. Valid population-based risk algorithms are essential for population health planning and policy decision-making. The purpose of this study is to develop, evaluate and apply cardiovascular disease (CVD) risk algorithms for the population setting. The Ontario sample of the Canadian Community Health Survey (2001, 2003, 2005; 77,251 respondents) will be used to assess risk factors focusing on health behaviours (physical activity, diet, smoking and alcohol use). Incident CVD outcomes will be assessed through linkage to administrative healthcare databases (619,886 person-years of follow-up until 31 December 2011). Sociodemographic factors (age, sex, immigrant status, education) and mediating factors such as presence of diabetes and hypertension will be included as predictors. Algorithms will be developed using competing risks survival analysis. The analysis plan adheres to published recommendations for the development of valid prediction models to limit the risk of overfitting and improve the quality of predictions. Key considerations are fully prespecifying the predictor variables; appropriate handling of missing data; use of flexible functions for continuous predictors; and avoiding data-driven variable selection procedures. The 2007 and 2009 surveys (approximately 50,000 respondents) will be used for validation. Calibration will be assessed overall and in predefined subgroups of importance to clinicians and policymakers. This study has been approved by the Ottawa Health Science Network Research Ethics Board. The findings will be disseminated through professional and scientific conferences, and in peer-reviewed journals. The algorithm will be accessible
Positive Predictive Values of International Classification of Diseases, 10th Revision Coding Algorithms to Identify Patients With Autosomal Dominant Polycystic Kidney Disease

Directory of Open Access Journals (Sweden)

Vinusha Kalatharan

2016-12-01

Full Text Available Background: International Classification of Diseases, 10th Revision (ICD-10) codes for autosomal dominant polycystic kidney disease (ADPKD) are used within several administrative health care databases. It is unknown whether these codes identify patients who meet strict clinical criteria for ADPKD. Objective: The objective of this study is (1) to determine whether different ICD-10 coding algorithms identify adult patients who meet strict clinical criteria for ADPKD as assessed through medical chart review and (2) to assess the number of patients identified with different ADPKD coding algorithms in Ontario. Design: Validation study of health care database codes, and prevalence. Setting: Ontario, Canada. Patients: For the chart review, 201 adult patients with hospital encounters between April 1, 2002, and March 31, 2014, assigned either ICD-10 code Q61.2 or Q61.3. Measurements: This study measured the positive predictive value of the ICD-10 coding algorithms and the number of Ontarians identified with different coding algorithms. Methods: We manually reviewed a random sample of medical charts in London, Ontario, Canada, and determined whether or not ADPKD was present according to strict clinical criteria. Results: The presence of either ICD-10 code Q61.2 or Q61.3 in a hospital encounter had a positive predictive value of 85% (95% confidence interval [CI], 79%-89%) and identified 2981 Ontarians (0.02% of the Ontario adult population). The presence of ICD-10 code Q61.2 in a hospital encounter had a positive predictive value of 97% (95% CI, 86%-100%) and identified 394 adults in Ontario (0.003% of the Ontario adult population). Limitations: (1) We could not calculate other measures of validity; (2) the coding algorithms do not identify patients without hospital encounters; and (3) coding practices may differ between hospitals. Conclusions: Most patients with ICD-10 code Q61.2 or Q61.3 assigned during their hospital encounters have ADPKD according to the clinical
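A positive predictive value and its confidence interval, of the kind reported in this record, can be computed as in the sketch below. The abstract does not say which interval method the authors used, so the Wilson score interval is an assumption here, and the counts in the example are hypothetical rather than taken from the study.

```python
import math


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Positive predictive value with a Wilson score interval.

    confirmed: flagged patients confirmed by chart review (true positives).
    flagged:   all patients flagged by the coding algorithm.
    z:         normal quantile (1.96 for a 95% interval).
    Returns (ppv, lower, upper) as proportions.
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need 0 <= confirmed <= flagged and flagged > 0")
    p = confirmed / flagged
    denom = 1.0 + z * z / flagged
    centre = (p + z * z / (2.0 * flagged)) / denom
    half = z * math.sqrt(p * (1.0 - p) / flagged + z * z / (4.0 * flagged ** 2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts for illustration only: 170 confirmed out of 200 flagged.
ppv, lower, upper = ppv_with_wilson_ci(170, 200)
```

The Wilson interval is used here instead of the simpler normal approximation because it stays within 0 and 1 and behaves better when the proportion is close to 100%, as with the second coding algorithm in the abstract.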
Machine Learning Algorithms for prediction of regions of high Reynolds Averaged Navier Stokes Uncertainty

Science.gov (United States)

Mishra, Aashwin; Iaccarino, Gianluca

2017-11-01

In spite of their deficiencies, RANS models represent the workhorse for industrial investigations into turbulent flows. In this context, it is essential to provide diagnostic measures to assess the quality of RANS predictions. To this end, the primary step is to identify feature importances amongst massive sets of potentially descriptive and discriminative flow features. This aids the physical interpretability of the resultant discrepancy model and its extensibility to similar problems. Recent investigations have utilized approaches such as Random Forests, Support Vector Machines and the Least Absolute Shrinkage and Selection Operator for feature selection. With examples, we exhibit how such methods may not be suitable for turbulent flow datasets. The underlying rationale, such as the correlation bias and the required conditions for the success of penalized algorithms, are discussed with illustrative examples. Finally, we provide alternate approaches using convex combinations of regularized regression approaches and randomized sub-sampling in combination with feature selection algorithms, to infer model structure from data. This research was supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).
Runoff prediction using rainfall data from microwave links: Tabor case study.

Science.gov (United States)

Stransky, David; Fencl, Martin; Bares, Vojtech

2018-05-01

Rainfall spatio-temporal distribution is of great concern for rainfall-runoff modellers. Standard rainfall observations are, however, often scarce and/or expensive to obtain. Thus, rainfall observations from non-traditional sensors such as commercial microwave links (CMLs) represent a promising alternative. In this paper, rainfall observations from a municipal rain gauge (RG) monitoring network were complemented by CMLs and used as an input to a standard urban drainage model operated by the water utility of the Tabor agglomeration (CZ). Two rainfall datasets were used for runoff predictions: (i) the municipal RG network, i.e. the observation layout used by the water utility, and (ii) CMLs adjusted by the municipal RGs. The performance was evaluated in terms of runoff volumes and hydrograph shapes. The use of CMLs did not lead to distinctively better predictions in terms of runoff volumes; however, CMLs outperformed RGs used alone when reproducing a hydrograph's dynamics (peak discharges, Nash-Sutcliffe coefficient and timing of the hydrograph's rising limb). This finding is promising for a number of urban drainage tasks that depend on flow dynamics. Moreover, CML data can be obtained from a telecommunication operator's data cloud at virtually no cost. That makes their use attractive for cities unable to improve their monitoring infrastructure for economic or organizational reasons.
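The Nash-Sutcliffe coefficient used above to judge hydrograph shape compares the squared simulation error with the variance of the observations. A minimal sketch follows; the function name and the discharge values are hypothetical and are not data from the Tabor case study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2).

    1.0 means a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations; negative is worse.
    """
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series is constant; efficiency is undefined")
    return 1.0 - squared_error / variance


# Hypothetical discharge series (arbitrary units) for one rain event.
observed_flow = [1.0, 2.0, 5.0, 3.0, 1.0]
simulated_flow = [1.2, 2.1, 4.4, 3.3, 1.1]
nse = nash_sutcliffe(observed_flow, simulated_flow)
```

Because the error is squared, the coefficient is dominated by the largest discharges, which is why it is often read alongside peak discharge and rising-limb timing, as in this record.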
Intrinsic disorder in Viral Proteins Genome-Linked: experimental and predictive analyses

Directory of Open Access Journals (Sweden)

Van Dorsselaer Alain

2009-02-01

Full Text Available Abstract Background: VPgs are viral proteins linked to the 5' end of some viral genomes. Interactions between several VPgs and eukaryotic translation initiation factors eIF4Es are critical for plant infection. However, VPgs are not restricted to phytoviruses, being also involved in genome replication and protein translation of several animal viruses. To date, structural data are still limited to small picornaviral VPgs. Recently, three phytoviral VPgs were shown to be natively unfolded proteins. Results: In this paper, we report the bacterial expression, purification and biochemical characterization of two phytoviral VPgs, namely the VPgs of Rice yellow mottle virus (RYMV, genus Sobemovirus) and Lettuce mosaic virus (LMV, genus Potyvirus). Using far-UV circular dichroism and size exclusion chromatography, we show that RYMV and LMV VPgs are predominantly or partly unstructured in solution, respectively. Using several disorder predictors, we show that both proteins are predicted to possess disordered regions. We next extend these results to 14 VPgs representative of viral diversity. Disordered regions were predicted in all VPg sequences, whatever the genus and the family. Conclusion: Based on these results, we propose that intrinsic disorder is a common feature of VPgs. The functional role of intrinsic disorder is discussed in light of the biological roles of VPgs.
Prediction of composite fatigue life under variable amplitude loading using artificial neural network trained by genetic algorithm

Science.gov (United States)

Rohman, Muhamad Nur; Hidayat, Mas Irfan P.; Purniawan, Agung

2018-04-01

Neural networks (NN) have been widely used for fatigue life prediction. For polymeric-base composites, an NN model needs to be developed that copes with limited fatigue data and can predict fatigue life under varying stress amplitudes at different stress ratios. In the present paper, a multilayer perceptron (MLP) neural network model is developed, and a genetic algorithm is employed to optimize the NN weights for fatigue life prediction of polymeric-base composite materials under variable amplitude loading. In simulations with two different composite systems, E-glass fabrics/epoxy (layup [(±45)/(0)2]S) and E-glass/polyester (layup [90/0/±45/0]S), NN models trained with fatigue data from two stress ratios, representing limited fatigue data, could be used to predict fatigue life at another four and seven stress ratios, respectively, with high accuracy. The accuracy of the NN predictions was quantified by a small mean square error (MSE). When 33% of the total fatigue data was used for training, the NN model was able to produce high accuracy for all stress ratios. When less fatigue data was used for training (22% of the total fatigue data), the NN model was still able to produce a high coefficient of determination between the predictions and the experimental results.
Quad-Polarization Transmission for High-Capacity IM/DD Links

DEFF Research Database (Denmark)

Estaran Tolosa, Jose Manuel; Castaneda, Mario A. Usuga; Porto da Silva, Edson

2014-01-01

We report the first experimental demonstration of IM/DD links using four states of polarization. Fiber-induced polarization rotation is compensated with a simple tracking algorithm operating on the Stokes space. The principle is proven at 128 Gb/s over 2-km SSMF...
Predicting Solar Flares Using SDO/HMI Vector Magnetic Data Products and the Random Forest Algorithm

Energy Technology Data Exchange (ETDEWEB)

Liu, Chang; Deng, Na; Wang, Haimin [Space Weather Research Laboratory, New Jersey Institute of Technology, University Heights, Newark, NJ 07102-1982 (United States); Wang, Jason T. L., E-mail: chang.liu@njit.edu, E-mail: na.deng@njit.edu, E-mail: haimin.wang@njit.edu, E-mail: jason.t.wang@njit.edu [Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102-1982 (United States)

2017-07-10

Adverse space-weather effects can often be traced to solar flares, the prediction of which has drawn significant research interest. The Helioseismic and Magnetic Imager (HMI) produces full-disk vector magnetograms with continuous high cadence, but flare prediction efforts utilizing this unprecedented data source are still limited. Here we report results of flare prediction using physical parameters provided by the Space-weather HMI Active Region Patches (SHARP) and related data products. We survey X-ray flares that occurred from 2010 May to 2016 December and categorize their source regions into four classes (B, C, M, and X) according to the maximum GOES magnitude of flares they generated. We then retrieve SHARP-related parameters for each selected region at the beginning of its flare date to build a database. Finally, we train a machine-learning algorithm, called random forest (RF), to predict the occurrence of a certain class of flares in a given active region within 24 hr, evaluate the classifier performance using the 10-fold cross-validation scheme, and characterize the results using standard performance metrics. Compared to previous works, our experiments indicate that using the HMI parameters and RF is a valid method for flare forecasting with fairly reasonable prediction performance. To our knowledge, this is the first time that RF has been used to make multiclass predictions of solar flares. We also find that the total unsigned quantities of vertical current, current helicity, and flux near the polarity inversion line are among the most important parameters for classifying flaring regions into different classes.
CAMAC based computer--computer communications via microprocessor data links

International Nuclear Information System (INIS)

Potter, J.M.; Machen, D.R.; Naivar, F.J.; Elkins, E.P.; Simmonds, D.D.

1976-01-01

Communication between the central control computer and remote satellite data acquisition/control stations at the Clinton P. Anderson Meson Physics Facility (LAMPF) is presently accomplished through the use of CAMAC-based Data Link Modules. With the advent of the microprocessor, a new philosophy for digital data communications has evolved. Data Link modules containing microprocessor controllers provide link management and communication network protocol through algorithms executed in the Data Link microprocessor.
Evaluation Of Algorithms Of Anti-HIV Antibody Tests

Directory of Open Access Journals (Sweden)

Paranjape R.S

1997-01-01

Full Text Available Research question: Can alternate algorithms be used in place of the conventional algorithm for epidemiological studies of HIV infection at lower expense? Objective: To compare the results of HIV seroprevalence as determined by test algorithms combining three kits with the conventional test algorithm. Study design: Cross-sectional. Participants: 282 truck drivers. Statistical analysis: Sensitivity and specificity analysis and predictive values. Results: Three different algorithms that do not include Western Blot (WB) were compared with the conventional algorithm in a truck driver population with 5.6% prevalence of HIV-1 infection. Algorithms with one EIA (Genetic Systems or Biotest) and a rapid test (Immunocomb), or with two EIAs, showed 100% positive predictive value in relation to the conventional algorithm. Using an algorithm with an EIA as the screening test and a rapid test as the confirmatory test was 50 to 70% less expensive than the conventional algorithm per positive serum sample. These algorithms obviate the interpretation of indeterminate results and also give a differential diagnosis of HIV-2 infection. Alternate algorithms are ideally suited for community-based control programmes in developing countries. Application of these algorithms in populations with low prevalence should also be studied in order to evaluate universal applicability.
QIKAIM, a fast seminumerical algorithm for the generation of minute-of-arc accuracy satellite predictions

Science.gov (United States)

Vermeer, M.

1981-07-01

A program was designed to replace AIMLASER for the generation of aiming predictions, to achieve a major saving in computing time, and to keep the program small enough for use even on small systems. An approach was adopted that incorporated the numerical integration of the orbit through a pass, limiting the computation of osculating elements to only one point per pass. The numerical integration method which is fourth order in delta t in the cumulative error after a given time lapse is presented. Algorithms are explained and a flowchart and listing of the program are provided.
Predicting breast screening attendance using machine learning techniques.

Science.gov (United States)

Baskaran, Vikraman; Guergachi, Aziz; Bali, Rajeev K; Naguib, Raouf N G

2011-03-01

Machine learning-based prediction has been effectively applied in many healthcare applications. Predicting breast screening attendance using machine learning (prior to the actual mammogram) is a new field. This paper presents new predictor attributes for such an algorithm. It describes a new hybrid algorithm that relies on back-propagation and radial basis function-based neural networks for prediction. The algorithm has been developed in an open source-based environment. The algorithm was tested on a 13-year dataset (1995-2008). This paper compares the algorithm and validates its accuracy and efficiency with different platforms. Nearly 80% accuracy and 88% positive predictive value and sensitivity were recorded for the algorithm. The results were encouraging; a negative predictive value and specificity of 40-50% warrant further work. Preliminary results were promising and provided ample reason for testing the algorithm on a larger scale.
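The five figures quoted in this record all derive from the four cells of a confusion matrix. The sketch below shows those definitions. The counts are hypothetical, chosen only so that the output has a similar pattern to the one reported (high sensitivity and positive predictive value, low specificity and negative predictive value); they are not taken from the paper, and the function assumes every denominator is non-zero.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts.

    tp: predicted attend, did attend      fp: predicted attend, did not attend
    tn: predicted non-attend, did not     fn: predicted non-attend, did attend
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of attenders correctly predicted
        "specificity": tn / (tn + fp),   # share of non-attenders correctly predicted
        "ppv": tp / (tp + fp),           # predicted attenders who did attend
        "npv": tn / (tn + fn),           # predicted non-attenders who did not attend
    }


# Hypothetical counts for 100 invitations, for illustration only.
metrics = screening_metrics(tp=70, fp=10, tn=10, fn=10)
```

With these counts the accuracy is 0.8 even though only half of the non-attenders are identified, which shows why accuracy alone can look encouraging when one class dominates the data.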

Algorithms and Methods for High-Performance Model Predictive Control

DEFF Research Database (Denmark)

Frison, Gianluca

routines employed in the numerical tests. The main focus of this thesis is on linear MPC problems. In this thesis, both the algorithms and their implementation are equally important. About the implementation, a novel implementation strategy for the dense linear algebra routines in embedded optimization...... is proposed, aiming at improving the computational performance in case of small matrices. About the algorithms, they are built on top of the proposed linear algebra, and they are tailored to exploit the high-level structure of the MPC problems, with special care on reducing the computational complexity....
REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH

Directory of Open Access Journals (Sweden)

R. SATISHKUMAR

2017-01-01

Full Text Available Bearings play a significant role in reducing the downtime of all rotating machinery. The increasing trend of bearing failures in recent times has highlighted the need for condition monitoring. Multiple factors are associated with a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of bearings in operation. In the past, predictive models with regression techniques were widely used for bearing lifetime estimation. The objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach, with the ultimate aim of strengthening predictive maintenance. The present study used a classification approach following the concepts of machine learning, and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment in which new bearings were run at pre-defined load and speed conditions until they failed naturally. In the present work, statistical features were extracted, feature selection was carried out using the J48 decision tree, and the selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique, was used to build a predictive model to estimate the lifetime of bearings. The performance of the classifier was cross-validated with distinct data. The results show that the K-Star classification model gives 98.56% classification accuracy with the selected features.
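The abstract does not list which statistical features were extracted from the vibration signals. As a minimal sketch, assuming a common set of time-domain descriptors (mean, standard deviation, RMS, peak, crest factor, skewness, kurtosis), one window of a signal could be summarised as follows. The function name `vibration_features` and the choice of features are illustrative assumptions, not the authors' method; the resulting feature vector would then be passed to a feature selector and a classifier such as J48 and K-Star.

```python
import math
import statistics


def vibration_features(signal):
    """Summarise one window of a vibration signal with time-domain statistics.

    The feature set is an assumption for illustration; the abstract only
    says that "statistical features" were used.
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    if std == 0:
        raise ValueError("constant signal: skewness and kurtosis are undefined")
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "skewness": sum((v - mean) ** 3 for v in signal) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in signal) / (n * std ** 4),
    }


# Tiny synthetic window (not data from the study).
features = vibration_features([1.0, -1.0, 1.0, -1.0])
print(features)
```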
Higher-spin cluster algorithms: the Heisenberg spin and U(1) quantum link models

Energy Technology Data Exchange (ETDEWEB)

Chudnovsky, V

2000-03-01

I discuss here how the highly-efficient spin-1/2 cluster algorithm for the Heisenberg antiferromagnet may be extended to higher-dimensional representations; some numerical results are provided. The same extensions can be used for the U(1) flux cluster algorithm, but have not yielded signals of the desired Coulomb phase of the system.
Higher-spin cluster algorithms: the Heisenberg spin and U(1) quantum link models

International Nuclear Information System (INIS)

Chudnovsky, V.

2000-01-01

I discuss here how the highly-efficient spin-1/2 cluster algorithm for the Heisenberg antiferromagnet may be extended to higher-dimensional representations; some numerical results are provided. The same extensions can be used for the U(1) flux cluster algorithm, but have not yielded signals of the desired Coulomb phase of the system
Cy-preds: An algorithm and a web service for the analysis and prediction of cysteine reactivity.

Science.gov (United States)

Soylu, İnanç; Marino, Stefano M

2016-02-01

Cysteine (Cys) is a critically important amino acid, serving a variety of functions within proteins including structural roles, catalysis, and regulation of function through post-translational modifications. Predicting which Cys residues are likely to be reactive is a much sought-after capability. Few methods are currently available for the task, either based on evaluation of physicochemical features (e.g., pKa and exposure) or based on similarity with known instances. In this study, we developed an algorithm (named HAL-Cy) which blends previous work with novel implementations to distinguish reactive from nonreactive Cys. HAL-Cy has two major components: (i) an energy-based part, rooted in the evaluation of H-bond network contributions, and (ii) a knowledge-based part, composed of different profiling approaches (including a newly developed weighting matrix for sequence profiling). In our evaluations, HAL-Cy provided significantly improved performance in comparisons with existing approaches. We implemented our algorithm in a web service (Cy-preds), the ultimate product of our work; we provided it with a variety of additional features, tools, and options: Cy-preds is capable of performing fully automated calculations for a thorough analysis of Cys reactivity in proteins, ranging from reactivity predictions (e.g., with HAL-Cy) to functional characterization. We believe it represents an original, effective, and very useful addition to the current array of tools available to scientists involved in redox biology, Cys biochemistry, and structural bioinformatics. © 2015 Wiley Periodicals, Inc.
Prediction of China's coal production-environmental pollution based on a hybrid genetic algorithm-system dynamics model

International Nuclear Information System (INIS)

Yu Shiwei; Wei Yiming

2012-01-01

This paper proposes a hybrid model based on genetic algorithm (GA) and system dynamics (SD) for coal production–environmental pollution load in China. GA has been utilized in the optimization of the parameters of the SD model to reduce implementation subjectivity. The chain of “Economic development–coal demand–coal production–environmental pollution load” of China in 2030 was predicted, and scenarios were analyzed. Results show that: (1) GA performs well in optimizing the parameters of the SD model objectively and in simulating the historical data; (2) The demand for coal energy continuously increases, although the coal intensity has actually decreased because of China's persistent economic development. Furthermore, instead of reaching a turning point by 2030, the environmental pollution load continuously increases each year even under the scenario where coal intensity decreased by 20% and investment in pollution abatement increased by 20%; (3) For abating the amount of “three types of wastes”, reducing the coal intensity is more effective than reducing the polluted production per tonne of coal and increasing investment in pollution control. - Highlights: ► We propose a GA-SD model for China's coal production-pollution prediction. ► Genetic algorithm (GA) can objectively and accurately optimize parameters of system dynamics (SD) model. ► Environmental pollution in China is projected to grow in our scenarios by 2030. ► The mechanism of reducing waste production per tonne of coal mining is more effective than others.
False-nearest-neighbors algorithm and noise-corrupted time series

International Nuclear Information System (INIS)

Rhodes, C.; Morari, M.

1997-01-01

The false-nearest-neighbors (FNN) algorithm was originally developed to determine the embedding dimension for autonomous time series. For noise-free computer-generated time series, the algorithm does a good job in predicting the embedding dimension. However, the problem of predicting the embedding dimension when the time-series data are corrupted by noise was not fully examined in the original studies of the FNN algorithm. Here it is shown that with large data sets, even small amounts of noise can lead to incorrect prediction of the embedding dimension. Surprisingly, as the length of the time series analyzed by FNN grows larger, the cause of incorrect prediction becomes more pronounced. An analysis of the effect of noise on the FNN algorithm and a solution for dealing with the effects of noise are given here. Some results on the theoretically correct choice of the FNN threshold are also presented. © 1997 The American Physical Society
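For orientation, the basic FNN test can be sketched as below. This is a minimal illustration of the standard distance-ratio criterion, not the noise-robust variant the abstract refers to. A point's nearest neighbour in a `dim`-dimensional delay embedding counts as "false" if adding the next delay coordinate increases their separation by more than a threshold ratio. The function names, the default threshold `r_tol=10.0`, and the brute-force neighbour search are assumptions made for the sketch.

```python
import math


def delay_embed(series, dim, tau=1):
    """Build delay vectors [x(i), x(i+tau), ..., x(i+(dim-1)*tau)]."""
    n = len(series) - (dim - 1) * tau
    return [[series[i + k * tau] for k in range(dim)] for i in range(n)]


def fnn_fraction(series, dim, tau=1, r_tol=10.0):
    """Fraction of nearest neighbours in `dim` dimensions that are false.

    A neighbour is false when the extra coordinate x(i + dim*tau) separates
    the pair by more than r_tol times their distance in `dim` dimensions.
    """
    n = len(series) - dim * tau  # points that still have a "next" coordinate
    vectors = delay_embed(series, dim, tau)[:n]
    false_count = 0
    counted = 0
    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):  # brute-force nearest-neighbour search, O(n^2)
            if j == i:
                continue
            d = math.dist(vectors[i], vectors[j])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None or best_d == 0.0:
            continue  # skip exact duplicates to avoid division by zero
        extra = abs(series[i + dim * tau] - series[best_j + dim * tau])
        counted += 1
        if extra / best_d > r_tol:
            false_count += 1
    return false_count / counted if counted else 0.0


# Noise-free sine wave: a synthetic test signal, not data from the paper.
sine = [math.sin(0.3 * t) for t in range(200)]
for d in (1, 2, 3):
    print(d, fnn_fraction(sine, d))
```

The embedding dimension is usually read off as the smallest `dim` at which the false-neighbour fraction falls to (nearly) zero; for the noise-free sine above that happens at `dim=2`.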
A rapid learning and dynamic stepwise updating algorithm for flat neural networks and the application to time-series prediction.

Science.gov (United States)

Chen, C P; Wan, J Z

1999-01-01

A fast learning algorithm is proposed to find optimal weights of flat neural networks (especially the functional-link network). Although flat networks are used for nonlinear function approximation, they can be formulated as linear systems. Thus, the weights of the networks can be solved easily using a linear least-squares method. This formulation makes it easier to update the weights instantly for both a newly added pattern and a newly added enhancement node. A dynamic stepwise updating algorithm is proposed to update the weights of the system on the fly. The model is tested on several time-series data sets including an infrared laser data set, a chaotic time series, a monthly flour price data set, and a nonlinear system identification problem. The simulation results are compared to existing models in which more complex architectures and more costly training are needed. The results indicate that the proposed model is very attractive for real-time processes.
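The key observation, that a flat network is linear in its output weights, can be illustrated with a small sketch. It assumes a scalar input, a fixed feature expansion (bias, the input itself, and two trigonometric enhancement nodes), and a batch least-squares solve via the normal equations. The feature choice and all function names are hypothetical, and the paper's dynamic stepwise update for new patterns or nodes is not reproduced here.

```python
import math


def expand(x):
    """Flat (functional-link) feature row: bias, input, two enhancement nodes."""
    return [1.0, x, math.sin(math.pi * x), math.cos(math.pi * x)]


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def fit_flat_network(xs, ys):
    """Least-squares output weights from the normal equations A^T A w = A^T y."""
    rows = [expand(x) for x in xs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(ata, aty)


def predict(weights, x):
    return sum(w * f for w, f in zip(weights, expand(x)))


# Synthetic target that lies in the span of the chosen features.
xs = [i / 10 - 1 for i in range(21)]
ys = [2 + 3 * x + math.sin(math.pi * x) for x in xs]
weights = fit_flat_network(xs, ys)
print(weights)
```

Because no iterative training is involved, the fit is a single linear solve; for larger or ill-conditioned feature sets a QR- or SVD-based solver would be the more robust choice than the normal equations used here.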
A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. 1: Theoretical development and application to yearly predictions for selected cities in the United States

Science.gov (United States)

Manning, Robert M.

1986-01-01

A rain attenuation prediction model is described for use in calculating satellite communication link availability for any specific location in the world that is characterized by an extended record of rainfall. Such a formalism is necessary for the accurate assessment of such availability predictions in the case of the small user-terminal concept of the Advanced Communication Technology Satellite (ACTS) Project. The model employs the theory of extreme value statistics to generate the necessary statistical rainrate parameters from rain data in the form compiled by the National Weather Service. These location dependent rain statistics are then applied to a rain attenuation model to obtain a yearly prediction of the occurrence of attenuation on any satellite link at that location. The predictions of this model are compared to those of the Crane Two-Component Rain Model and some empirical data and found to be very good. The model is then used to calculate rain attenuation statistics at 59 locations in the United States (including Alaska and Hawaii) for the 20 GHz downlinks and 30 GHz uplinks of the proposed ACTS system. The flexibility of this modeling formalism is such that it allows a complete and unified treatment of the temporal aspects of rain attenuation that leads to the design of an optimum stochastic power control algorithm, the purpose of which is to efficiently counter such rain fades on a satellite link.
Linking CDOM spectral absorption to dissolved organic carbon concentrations and loadings in boreal estuaries

DEFF Research Database (Denmark)

Asmala, Eero; Stedmon, Colin A.; Thomas, David N.

2012-01-01

concentrations across the salinity gradient and ranged from 1.67 to 33.4 m−1. The link between DOC and CDOM was studied using a range of wavelengths and algorithms. Wavelengths between 250 and 270 nm gave the best predictions with single linear regression. Total dissolved iron was found to influence......The quantity of chromophoric dissolved organic matter (CDOM) and dissolved organic carbon (DOC) in three Finnish estuaries (Karjaanjoki, Kyrönjoki and Kiiminkijoki) was investigated, with respect to predicting DOC concentrations and loadings from spectral CDOM absorption measurements. Altogether 87...... the prediction in wavelengths above 520nm. Despite significant seasonal and spatial differences in DOC–CDOM models, a universal relationship was tested with an independent data set and found to be robust. DOC and CDOM yields (loading/catchment area) from the catchments ranged from 1.98 to 5.44gCm−2yr−1, and 1...
Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

OpenAIRE

Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

1984-01-01

We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product KBk2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of ±4.1 in KBk2 over a range of 10^4 in the same parameter. The quantitative evaluation was linked to ...
Using Tree Detection Algorithms to Predict Stand Sapwood Area, Basal Area and Stocking Density in Eucalyptus regnans Forest

Directory of Open Access Journals (Sweden)

Dominik Jaskierniak

2015-06-01

Full Text Available Managers of forested water supply catchments require efficient and accurate methods to quantify changes in forest water use due to changes in forest structure and density after disturbance. Using Light Detection and Ranging (LiDAR) data with as few as 0.9 pulses m−2, we applied a local maximum filtering (LMF) method and normalised cut (NCut) algorithm to predict stocking density (SDen) of a 69-year-old Eucalyptus regnans forest comprising 251 plots with resolution of the order of 0.04 ha. Using the NCut method we predicted basal area (BAHa) per hectare and sapwood area (SAHa) per hectare, a well-established proxy for transpiration. Sapwood area was also indirectly estimated with allometric relationships dependent on LiDAR derived SDen and BAHa using a computationally efficient procedure. The individual tree detection (ITD) rates for the LMF and NCut methods respectively had 72% and 68% of stems correctly identified, 25% and 20% of stems missed, and 2% and 12% of stems over-segmented. The significantly higher computational requirement of the NCut algorithm makes the LMF method more suitable for predicting SDen across large forested areas. Using NCut derived ITD segments, observed versus predicted stand BAHa had R2 ranging from 0.70 to 0.98 across six catchments, whereas a generalised parsimonious model applied to all sites used the portion of hits greater than 37 m in height (PH37) to explain 68% of BAHa. For extrapolating one ha resolution SAHa estimates across large forested catchments, we found that directly relating SAHa to NCut derived LiDAR indices (R2 = 0.56) was slightly more accurate but computationally more demanding than indirect estimates of SAHa using allometric relationships consisting of BAHa (R2 = 0.50) or a sapwood perimeter index, defined as (BAHaSDen)½ (R2 = 0.48).
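As a rough illustration of local maximum filtering, the sketch below scans a rasterised canopy height model (CHM) and marks a cell as a tree top when it is strictly higher than every other cell in a square window around it. The grid values, window size, height cut-off, and function names are all assumptions for the example; the study's actual LMF configuration is not given in the abstract.

```python
def local_maxima(chm, window=1, min_height=2.0):
    """Return (row, col, height) for cells that are strict local maxima.

    chm        -- canopy height model as a list of rows (heights in metres)
    window     -- half-width of the square search window, in cells
    min_height -- ignore cells lower than this (ground and understorey)
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbours):
                tops.append((r, c, h))
    return tops


def stocking_density(tops, plot_area_ha):
    """Detected stems per hectare for a plot of the given area."""
    return len(tops) / plot_area_ha


# Made-up 5 x 5 CHM with two crowns; not data from the study.
chm = [
    [0, 0, 0, 0, 0],
    [0, 30, 10, 0, 0],
    [0, 10, 5, 0, 0],
    [0, 0, 0, 45, 20],
    [0, 0, 0, 20, 10],
]
tops = local_maxima(chm)
print(tops, stocking_density(tops, plot_area_ha=0.04))
```

A fixed window like this is cheap, which is consistent with the abstract's remark that LMF suits large areas; it also tends to miss stems whose crowns are lower than a close neighbour's.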
A comprehensive comparison of comparative RNA structure prediction approaches

DEFF Research Database (Denmark)

Gardner, P. P.; Giegerich, R.

2004-01-01

-finding and multiple-sequence-alignment algorithms. Results Here we evaluate a number of RNA folding algorithms using reliable RNA data-sets and compare their relative performance. Conclusions We conclude that comparative data can enhance structure prediction but structure-prediction-algorithms vary widely in terms......Background An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet, independent benchmarking of these algorithms is rarely performed as is now common practice for protein-folding, gene...
DIANA-microT web server: elucidating microRNA functions through target prediction.

Science.gov (United States)

Maragkakis, M; Reczko, M; Simossis, V A; Alexiou, P; Papadopoulos, G L; Dalamagas, T; Giannopoulos, G; Goumas, G; Koukis, E; Kourtis, K; Vergoulis, T; Koziris, N; Sellis, T; Tsanakas, P; Hatzigeorgiou, A G

2009-07-01

Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions with a user-friendly interface, providing extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches, and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users can search for targeted genes using different nomenclatures or functional features, such as a gene's possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that help in evaluating the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. In this assessment, DIANA-microT 3.0 achieved the highest ratio of correctly predicted targets over all predicted targets, at 66%. The DIANA-microT web server is freely available at www.microrna.gr/microT.
Dynamic Algorithm for LQGPC Predictive Control

DEFF Research Database (Denmark)

Hangstrup, M.; Ordys, A.W.; Grimble, M.J.

1998-01-01

In this paper the optimal control law is derived for a multi-variable state space Linear Quadratic Gaussian Predictive Controller (LQGPC). A dynamic performance index is utilized resulting in an optimal steady state controller. Knowledge of future reference values is incorporated into the controller design and the solution is derived using the method of Lagrange multipliers. It is shown how the well-known GPC controller can be obtained as a special case of the LQGPC controller design. The important advantage of using the LQGPC framework for designing predictive, e.g. GPS is that LQGPC enables...
Research on E-Commerce Platform-Based Personalized Recommendation Algorithm

Directory of Open Access Journals (Sweden)

Zhijun Zhang

2016-01-01

Full Text Available To address data sparsity and timeliness in traditional E-commerce collaborative filtering recommendation algorithms, this paper exploits the fact that commodities in an E-commerce system belong to different levels: when constructing the user-item rating matrix, nonrated items are filled in by calculating the RF/IRF of the commodity’s corresponding level. In the recommendation prediction stage, to account for the timeliness of the recommendation system, a time-weighted recommendation prediction formula is adopted, yielding a personalized recommendation model that integrates the level filling method and rating time. The experimental results on a real dataset verify the feasibility and validity of the algorithm, which achieves higher prediction accuracy than existing recommendation algorithms.
Multi-link faults localization and restoration based on fuzzy fault set for dynamic optical networks.

Science.gov (United States)

Zhao, Yongli; Li, Xin; Li, Huadong; Wang, Xinbo; Zhang, Jie; Huang, Shanguo

2013-01-28

Based on a distributed method of bit-error-rate (BER) monitoring, a novel multi-link faults restoration algorithm is proposed for dynamic optical networks. The concept of fuzzy fault set (FFS) is first introduced for multi-link faults localization, which includes all possible optical equipment or fiber links with a membership describing the possibility of faults. Such a set is characterized by a membership function which assigns each object a grade of membership ranging from zero to one. OSPF protocol extension is designed for the BER information flooding in the network. The BER information can be correlated to link faults through FFS. Based on the BER information and FFS, multi-link faults localization mechanism and restoration algorithm are implemented and experimentally demonstrated on a GMPLS enabled optical network testbed with 40 wavelengths in each fiber link. Experimental results show that the novel localization mechanism has better performance compared with the extended limited perimeter vector matching (LVM) protocol and the restoration algorithm can improve the restoration success rate under multi-link faults scenario.
SU-E-T-516: Dosimetric Validation of AcurosXB Algorithm in Comparison with AAA & CCC Algorithms for VMAT Technique.

Science.gov (United States)

Kathirvel, M; Subramanian, V Sai; Arun, G; Thirumalaiswamy, S; Ramalingam, K; Kumar, S Ashok; Jagadeesh, K

2012-06-01

To dosimetrically validate the AcurosXB algorithm for Volumetric Modulated Arc Therapy (VMAT) in comparison with the standard clinical Anisotropic Analytic Algorithm (AAA) and Collapsed Cone Convolution (CCC) dose calculation algorithms. The AcurosXB dose calculation algorithm is available with the Varian Eclipse treatment planning system (V10). It uses a grid-based Boltzmann equation solver to predict dose precisely in less time. This study was made to assess the algorithm's ability to predict dose accurately as delivered, for which five clinical cases each of Brain, Head&Neck, Thoracic, Pelvic and SBRT were taken. Verification plans were created on a multicube phantom with an iMatrixx-2D detector array; dose prediction was then done with the AcurosXB, AAA and CCC (COMPASS System) algorithms, and the plans were delivered on a CLINAC-iX treatment machine. Delivered dose was captured in the iMatrixx plane for all 25 plans. Measured dose was taken as the reference to quantify the agreement of the AcurosXB calculation algorithm against the previously validated AAA and CCC algorithms. Gamma evaluation was performed with clinical criteria of distance-to-agreement 3 and 2 mm and dose difference 3 and 2% in omnipro-I'MRT software. Plans were evaluated in terms of correlation coefficient, quantitative area gamma and average gamma. The study shows good agreement, with mean correlation 0.9979±0.0012, 0.9984±0.0009 & 0.9979±0.0011 for AAA, CCC & Acuros respectively. Mean area gamma for criteria 3mm/3% was found to be 98.80±1.04, 98.14±2.31, 98.08±2.01 and for 2mm/2% was found to be 93.94±3.83, 87.17±10.54 & 92.36±5.46 for AAA, CCC & Acuros respectively. Mean average gamma for 3mm/3% was 0.26±0.07, 0.42±0.08, 0.28±0.09 and for 2mm/2% was found to be 0.39±0.10, 0.64±0.11, 0.42±0.13 for AAA, CCC & Acuros respectively. This study demonstrated that the AcurosXB algorithm had a good agreement with the AAA & CCC in terms of dose prediction. In conclusion AcurosXB algorithm provides a valid, accurate and speedy alternative to AAA
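The gamma evaluation mentioned above combines a distance-to-agreement (DTA) criterion and a dose-difference criterion into a single index. A simplified one-dimensional sketch is shown here, assuming global normalisation to the maximum measured dose and no interpolation between grid points; the function names and the sample profiles are illustrative and unrelated to the software or data used in the study.

```python
import math


def gamma_1d(positions_mm, measured, calculated, dta_mm=3.0, dose_pct=3.0):
    """Gamma index at each measured point of a 1D dose profile.

    For each measured point, gamma is the minimum over all calculated points
    of sqrt((dx / DTA)^2 + (dD / dose tolerance)^2). The dose tolerance is
    dose_pct percent of the maximum measured dose (global normalisation).
    """
    dose_tol = dose_pct / 100.0 * max(measured)
    gammas = []
    for x_m, d_m in zip(positions_mm, measured):
        gammas.append(min(
            math.hypot((x_c - x_m) / dta_mm, (d_c - d_m) / dose_tol)
            for x_c, d_c in zip(positions_mm, calculated)
        ))
    return gammas


def pass_rate(gammas):
    """Percentage of points with gamma <= 1."""
    return 100.0 * sum(g <= 1.0 for g in gammas) / len(gammas)


def average_gamma(gammas):
    return sum(gammas) / len(gammas)


# Made-up profiles on a 1 mm grid (arbitrary dose units).
positions = [float(i) for i in range(11)]
measured = [10, 30, 60, 90, 100, 100, 100, 90, 60, 30, 10]
calculated = [11, 31, 61, 91, 101, 101, 101, 91, 61, 31, 11]
g = gamma_1d(positions, measured, calculated)
print(pass_rate(g), average_gamma(g))
```

Clinical tools work on 2D or 3D dose grids and interpolate the calculated distribution, so this sketch only conveys the shape of the calculation behind the "area gamma" (pass rate) and "average gamma" figures quoted in the abstract.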
The performance of the backpropagation algorithm with varying slope of the activation function

International Nuclear Information System (INIS)

Bai Yanping; Zhang Haixia; Hao Yilong

2009-01-01

Some adaptations of the basic BP algorithm are proposed in order to provide an efficient method for non-linear data learning and prediction. In this paper, an adapted BP algorithm with varying slope of the activation function and different learning rates is put forward. The experimental results indicate that this algorithm achieves very good training performance. We also test the prediction performance of our adapted BP algorithm on 16 instances and compare the test results with those of the BP algorithm with gradient descent momentum and an adaptive learning rate. The results indicate that the adapted BP algorithm gives the best performance (100%) on the test examples, from which we conclude that it produces a smoothed reconstruction that generalizes better to new prediction function values than the BP algorithm improved with momentum.
Daylighting simulation: methods, algorithms, and resources

Energy Technology Data Exchange (ETDEWEB)

Carroll, William L.

1999-12-01

This document presents work conducted as part of Subtask C, "Daylighting Design Tools", Subgroup C2, "New Daylight Algorithms", of the IEA SHC Task 21 and the ECBCS Program Annex 29 "Daylight in Buildings". The search for and collection of daylighting analysis methods and algorithms led to two important observations. First, there is a wide range of needs for different types of methods to produce a complete analysis tool. These include: Geometry; Light modeling; Characterization of the natural illumination resource; Materials and components properties, representations; and Usability issues (interfaces, interoperability, representation of analysis results, etc). Second, very advantageously, there have been rapid advances in many basic methods in these areas, due to other forces. They are in part driven by: The commercial computer graphics community (commerce, entertainment); The lighting industry; Architectural rendering and visualization for projects; and Academia: Course materials, research. This has led to a very rich set of information resources that have direct applicability to the small daylighting analysis community. Furthermore, much of this information is in fact available online. Because much of the information about methods and algorithms is now online, an innovative reporting strategy was used: the core formats are electronic, and used to produce a printed form only secondarily. The electronic forms include both online WWW pages and a downloadable .PDF file with the same appearance and content. Both electronic forms include live primary and indirect links to actual information sources on the WWW. In most cases, little additional commentary is provided regarding the information links or citations that are provided. This in turn allows the report to be very concise. The links are expected to speak for themselves. The report consists of only about 10+ pages, with about 100+ primary links, but

Development and validation of algorithms to differentiate ductal carcinoma in situ from invasive breast cancer within administrative claims data.

Science.gov (United States)

Hirth, Jacqueline M; Hatch, Sandra S; Lin, Yu-Li; Giordano, Sharon H; Silva, H Colleen; Kuo, Yong-Fang

2018-04-18

Overtreatment is a common concern for patients with ductal carcinoma in situ (DCIS), but this entity is difficult to distinguish from invasive breast cancers in administrative claims data sets because DCIS often is coded as invasive breast cancer. Therefore, the authors developed and validated algorithms to select DCIS cases from administrative claims data to enable outcomes research in this type of data. This retrospective cohort using invasive breast cancer and DCIS cases included women aged 66 to 70 years in the 2004 through 2011 Texas Cancer Registry (TCR) data linked to Medicare administrative claims data. TCR records were used as "gold" standards to evaluate the sensitivity, specificity, and positive predictive value (PPV) of 2 algorithms. Women with a biopsy enrolled in Medicare parts A and B at 12 months before and 6 months after their first biopsy without a second incident diagnosis of DCIS or invasive breast cancer within 12 months in the TCR were included. Women in 2010 Medicare data were selected to test the algorithms in a general sample. In the TCR data set, a total of 6907 cases met inclusion criteria, with 1244 DCIS cases. The first algorithm had a sensitivity of 79%, a specificity of 89%, and a PPV of 62%. The second algorithm had a sensitivity of 50%, a specificity of 97%, and a PPV of 77%. Among women in the general sample, the specificity was high and the sensitivity was similar for both algorithms. However, the PPV was approximately 6% to 7% lower. DCIS frequently is miscoded as invasive breast cancer, and thus the proposed algorithms are useful to examine DCIS outcomes using data sets not linked to cancer registries. Cancer 2018. © 2018 American Cancer Society.
A new algorithm for coding geological terminology

Science.gov (United States)

Apon, W.

The Geological Survey of The Netherlands has developed an algorithm to convert the plain geological language of lithologic well logs into codes suitable for computer processing and link these to existing plotting programs. The algorithm is based on the "direct method" and operates in three steps: (1) searching for defined word combinations and assigning codes; (2) deleting duplicated codes; (3) correcting incorrect code combinations. Two simple auxiliary files are used. A simple PC demonstration program is included to enable readers to experiment with this algorithm. The Department of Quaternary Geology of the Geological Survey of The Netherlands possesses a large database of shallow lithologic well logs in plain language and has been using a program based on this algorithm for about 3 yr. Erroneous codes resulting from using this algorithm are less than 2%.
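The three steps listed in the abstract can be sketched as follows. The lookup tables, code letters, and the single correction rule are invented for illustration (the abstract does not describe the contents of the two auxiliary files), so this shows the control flow only, not the Survey's actual coding scheme.

```python
# Hypothetical phrase-to-code table (stands in for the first auxiliary file).
TERM_CODES = {
    "fine sand": "ZF",
    "sand": "Z",
    "silty": "S",
    "clay": "K",
    "peat": "V",
}

# Hypothetical correction table (stands in for the second auxiliary file):
# when the specific code is present, the general code is considered wrong.
SUPERSEDES = {"ZF": "Z"}


def encode(description):
    """Convert a plain-language lithologic description into a list of codes."""
    text = description.lower()

    # Step 1: search for defined word combinations and assign codes.
    hits = []
    for phrase, code in TERM_CODES.items():
        pos = text.find(phrase)
        while pos != -1:
            hits.append((pos, code))
            pos = text.find(phrase, pos + 1)
    codes = [code for _, code in sorted(hits)]  # order of appearance

    # Step 2: delete duplicated codes, keeping the first occurrence.
    unique = list(dict.fromkeys(codes))

    # Step 3: correct incorrect code combinations.
    for specific, general in SUPERSEDES.items():
        if specific in unique and general in unique:
            unique.remove(general)
    return unique


print(encode("Fine sand, silty, fine sand with clay"))
```

In this sketch step 1 deliberately over-generates (the phrase "fine sand" also matches "sand"), which is what gives steps 2 and 3 something to clean up.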
Prediction of severe retinopathy of prematurity using the WINROP algorithm in a cohort from Malopolska. A retrospective, single-center study.

Science.gov (United States)

Jagła, Mateusz; Peterko, Anna; Olesińska, Katarzyna; Szymońska, Izabela; Kwinta, Przemko

2017-01-01

Retinopathy of prematurity (ROP) is one of the leading avoidable causes of blindness in childhood in developed countries. Accurate diagnosis and treatment are essential for preventing the loss of vision. WINROP (https://www.winrop.com) is an online monitoring system which predicts the risk for ROP requiring treatment based on gestational age, birth weight, and body weight gain. To validate the diagnostic accuracy of the WINROP algorithm for the detection of severe ROP in a single-centre cohort of Polish high-risk preterm infants. Medical records of neonates born before 32 weeks of gestation and admitted to a third-level neonatal centre were reviewed in a 2-year retrospective investigation. 79 patients were included in the study: their gestational age, birth weight and body weight gain were entered into the WINROP system. The algorithm classified the risk for ROP as low or high and identified infants at high risk of developing severe ROP (type 1 ROP). Out of 79 patients, 37 received a high-risk alarm, of whom 22 developed severe ROP. A low-risk alarm was triggered in 42 infants; five of them developed type 1 ROP. The sensitivity of WINROP was found to be 81.5% (95% CI 61.9-93.7), specificity 71.2% (95% CI 56.9-82.9), negative predictive value (NPV) 88.1% (95% CI 76.7-94.3), and positive predictive value (PPV) 59.5% (95% CI 48.1-69.9). The accuracy of the test changed when WINROP was combined with surfactant therapy as an additional factor: sensitivity 96.3% (95% CI 81.0-99.9), specificity 63.5% (95% CI 49.0-76.4), NPV 97.1% (95% CI 82.3-99.6), and PPV 57.8% (95% CI 48.7-66.4). The sensitivity of the WINROP algorithm in the Polish cohort was not as high as that reported in developed countries. However, combined with additional factors (e.g. surfactant treatment) it can be useful for identifying the risk groups of sight-threatening ROP. The accuracy of the WINROP algorithm should be validated in a large multi-center prospective study in
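The headline accuracy figures for WINROP alone follow directly from the counts given in the abstract, as this short sketch shows. The 2×2 table is reconstructed from the text (37 high-risk alarms with 22 cases of severe ROP; 42 low-risk alarms with 5 cases); the function name is an assumption, and the confidence intervals are not recomputed.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }


# Counts reconstructed from the abstract:
#   37 high-risk alarms, 22 developed severe ROP -> TP = 22, FP = 15
#   42 low-risk alarms,   5 developed severe ROP -> FN = 5,  TN = 37
winrop = diagnostic_metrics(tp=22, fp=15, fn=5, tn=37)
for name, value in winrop.items():
    print(f"{name}: {value:.1f}%")
# sensitivity: 81.5%
# specificity: 71.2%
# ppv: 59.5%
# npv: 88.1%
```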
Efficient RNA structure comparison algorithms.

Science.gov (United States)

Arslan, Abdullah N; Anandan, Jithendar; Fry, Eric; Monschke, Keith; Ganneboina, Nitin; Bowerman, Jason

2017-12-01

The recently proposed relative addressing-based ([Formula: see text]) RNA secondary structure representation has important features by which an RNA structure database can be stored in a suffix array. A fast substructure search algorithm has been proposed based on binary search on this suffix array. Using this substructure search algorithm, we present a fast algorithm that finds the largest common substructure of given multiple RNA structures in [Formula: see text] format. The multiple RNA structure comparison problem is NP-hard in its general formulation. We introduce a new problem for comparing multiple RNA structures. This problem has a stricter similarity definition and objective, and we propose an algorithm that solves it efficiently. We also develop another comparison algorithm that iteratively calls this algorithm to locate nonoverlapping large common substructures in the compared RNAs. With the resulting tools, we improved the RNASSAC website (linked from http://faculty.tamuc.edu/aarslan ). This website now also includes two drawing tools: one specialized for preparing RNA substructures that can be used as input by the search tool, and another for automatically drawing the entire RNA structure from a given structure sequence.
Prediction of Flood Warning in Taiwan Using Nonlinear SVM with Simulated Annealing Algorithm

Science.gov (United States)

Lee, C.

2013-12-01

Flooding is an important issue in Taiwan, because the island's narrow, high topography makes many of its rivers steep. Tropical depressions such as typhoons frequently cause rivers to flood, and every time a typhoon has passed through Taiwan there have been floods along some rivers. Predicting river flow under extreme rainfall is therefore important for the government when issuing flood warnings. In Taiwan, warnings are classified into three levels according to warning water levels. The purpose of this study is to predict the flood warning level from precipitation, rainfall duration and riverbed slope. To model the problem, a machine learning model, a nonlinear support vector machine (SVM), is formulated to classify the flood warning level from this information. In addition, simulated annealing (SA), a probabilistic heuristic algorithm, is used to determine the optimal parameters of the SVM model. A case study of flood-prone rivers of different gradients in Taiwan is conducted. The contribution of this SVM model with simulated annealing is that it enables efficient announcement of flood warnings and helps protect residents along the rivers from the danger of floods.
Analog Circuit Design Optimization Based on Evolutionary Algorithms

Directory of Open Access Journals (Sweden)

Mansour Barari

2014-01-01

Full Text Available This paper investigates an evolutionary-based design system for automated sizing of analog integrated circuits (ICs). Two evolutionary algorithms, a genetic algorithm and the PSO (Parswal particle swarm optimization) algorithm, are proposed to design analog ICs with practical user-defined specifications. On the basis of the combination of HSPICE and MATLAB, the system links circuit performances, evaluated through specific electrical simulation, to the optimization system in the MATLAB environment, for the selected topology. The system has been tested on typical and hard-to-design cases, such as complex analog blocks with stringent design requirements. The results show that the design specifications are closely met. Comparisons with available methods such as genetic algorithms show that the proposed algorithm offers important advantages in terms of optimization quality and robustness. Moreover, the algorithm is shown to be efficient.
Multiparametric classification links tumor microenvironments with tumor cell phenotype.

Directory of Open Access Journals (Sweden)

Bojana Gligorijevic

2014-11-01

Full Text Available While it has been established that a number of microenvironment components can affect the likelihood of metastasis, the link between microenvironment and tumor cell phenotypes is poorly understood. Here we have examined microenvironment control over two different tumor cell motility phenotypes required for metastasis. By high-resolution multiphoton microscopy of mammary carcinoma in mice, we detected two phenotypes of motile tumor cells, different in locomotion speed. Only slower tumor cells exhibited protrusions with molecular, morphological, and functional characteristics associated with invadopodia. Each region in the primary tumor exhibited either fast- or slow-locomotion. To understand how the tumor microenvironment controls invadopodium formation and tumor cell locomotion, we systematically analyzed components of the microenvironment previously associated with cell invasion and migration. No single microenvironmental property was able to predict the locations of tumor cell phenotypes in the tumor if used in isolation or combined linearly. To solve this, we utilized the support vector machine (SVM) algorithm to classify phenotypes in a nonlinear fashion. This approach identified conditions that promoted either motility phenotype. We then demonstrated that varying one of the conditions may change tumor cell behavior only in a context-dependent manner. In addition, to establish the link between phenotypes and cell fates, we photoconverted and monitored the fate of tumor cells in different microenvironments, finding that only tumor cells in the invadopodium-rich microenvironments degraded extracellular matrix (ECM) and disseminated. The number of invadopodia positively correlated with degradation, while inhibiting metalloproteases eliminated degradation and lung metastasis, consistent with a direct link among invadopodia, ECM degradation, and metastasis. We have detected and characterized two phenotypes of motile tumor cells in vivo, which
An Alternative Route to Teaching Fraction Division: Abstraction of Common Denominator Algorithm

Science.gov (United States)

Zembat, Ismail Özgür

2015-01-01

From a curricular standpoint, the traditional invert and multiply algorithm for division of fractions provides few affordances for linking to a rich understanding of fractions. On the other hand, an alternative algorithm, called the common denominator algorithm, has many such affordances. The current study serves as an argument for shifting…
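For readers unfamiliar with the common denominator algorithm, a minimal sketch follows. It assumes positive integer numerators and denominators; the function name and the choice of b·d as the common denominator are illustrative, not taken from the study. Once both fractions are written over the same denominator, the quotient is simply the quotient of the numerators: a/b ÷ c/d = (ad)/(bd) ÷ (cb)/(bd) = ad ÷ cb.

```python
from fractions import Fraction

def divide_common_denominator(a, b, c, d):
    """Compute (a/b) / (c/d) by rewriting both fractions over the denominator b*d."""
    common = b * d
    num1 = a * (common // b)  # a/b == num1/common
    num2 = c * (common // d)  # c/d == num2/common
    # With equal denominators, the quotient is the ratio of the numerators.
    return Fraction(num1, num2)

result = divide_common_denominator(3, 4, 1, 2)  # 3/4 divided by 1/2
print(result)
```

Here 3/4 and 1/2 become 6/8 and 4/8, so the answer is 6 ÷ 4 = 3/2, the same value that invert and multiply gives.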
Natural speech algorithm applied to baseline interview data can predict which patients will respond to psilocybin for treatment-resistant depression.

Science.gov (United States)

Carrillo, Facundo; Sigman, Mariano; Fernández Slezak, Diego; Ashton, Philip; Fitzgerald, Lily; Stroud, Jack; Nutt, David J; Carhart-Harris, Robin L

2018-04-01

Natural speech analytics has seen some improvements over recent years, and this has opened a window for objective and quantitative diagnosis in psychiatry. Here, we used a machine learning algorithm applied to natural speech to ask whether language properties measured before psilocybin treatment for treatment-resistant depression can predict for which patients it will be effective and for which it will not. A baseline autobiographical memory interview was conducted and transcribed. Patients with treatment-resistant depression received 2 doses of psilocybin, 10 mg and 25 mg, 7 days apart. Psychological support was provided before, during and after all dosing sessions. Quantitative speech measures were applied to the interview data from 17 patients and 18 untreated age-matched healthy control subjects. A machine learning algorithm was used to classify between controls and patients and to predict treatment response. Speech analytics and machine learning successfully differentiated depressed patients from healthy controls and distinguished treatment responders from non-responders with a statistically significant accuracy of 85% (75% precision). Automatic natural language analysis was used to predict effective response to treatment with psilocybin, suggesting that these tools offer a highly cost-effective facility for screening individuals for treatment suitability and sensitivity. The sample size was small and replication is required to strengthen inferences on these results. Copyright © 2018 Elsevier B.V. All rights reserved.
A similarity based agglomerative clustering algorithm in networks

Science.gov (United States)

Liu, Zhiyuan; Wang, Xiujuan; Ma, Yinghong

2018-04-01

The detection of clusters is beneficial for understanding the organization and functions of networks. Clusters, or communities, are usually groups of nodes that are densely interconnected but sparsely linked with other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The proposed method first calculates the similarity between each pair of nodes and forms pre-partitions according to the principle that each node is in the same community as its most similar neighbor. It then checks whether each partition satisfies a community criterion. Pre-partitions that do not satisfy the criterion are merged with the partitions that exert the greatest attraction on them, until no further changes occur. To measure the attraction ability of a partition, we propose an attraction index based on the importance of the linked nodes in the network. Our proposed method can therefore better exploit the nodes' properties and the network's structure. To test the performance of our algorithm, both synthetic and empirical networks of different scales are used. Simulation results show that the proposed algorithm obtains superior clustering results compared with six other widely used community detection algorithms.
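The pre-partition step, in which every node joins the community of its most similar neighbor, can be sketched as follows. The abstract does not name the similarity measure, so Jaccard similarity of closed neighborhoods is assumed here purely for illustration, and the community criterion and attraction index of the later merge phase are not modelled.

```python
def jaccard(adj, u, v):
    # Assumed similarity: Jaccard index of the closed neighborhoods of u and v.
    a = adj[u] | {u}
    b = adj[v] | {v}
    return len(a & b) / len(a | b)

def pre_partition(adj):
    """Group nodes so that each node shares a group with its most similar neighbor."""
    parent = {u: u for u in adj}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u in adj:
        if not adj[u]:
            continue  # isolated nodes stay in their own group
        best = max(sorted(adj[u]), key=lambda v: jaccard(adj, u, v))
        parent[find(u)] = find(best)

    groups = {}
    for u in adj:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())

# Toy network: two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
partitions = pre_partition(adj)
print(partitions)
```

On this toy network the two triangles are recovered as separate pre-partitions, because the bridging edge 2-3 connects the least similar pair of neighbors.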
Fast algorithm for Morphological Filters

International Nuclear Information System (INIS)

Lou Shan; Jiang Xiangqian; Scott, Paul J

2011-01-01

In surface metrology, morphological filters, which evolved from the envelope filtering system (E-system), work well for functional prediction of surface finish in the analysis of surfaces in contact. The naive algorithms are time consuming, especially for areal data, and are not generally adopted in practice. A fast algorithm based on the alpha shape is proposed. In theory, the hull obtained by rolling the alpha ball is equivalent to the morphological opening/closing. The algorithm depends on Delaunay triangulation, with time complexity O(n log n). In contrast to the naive algorithms, it generates the opening and closing envelopes without combining dilation and erosion. Edge distortion is corrected by reflective padding for open profiles/surfaces. Spikes in the sample data are detected and points are interpolated to prevent singularities. The proposed algorithm works well for both morphological profile and area filters. Examples are presented to demonstrate the validity of this algorithm and its superior efficiency over the naive algorithm.
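For context, the naive approach that the abstract contrasts with, a closing computed as dilation followed by erosion, can be sketched for a uniformly sampled profile. This is not the alpha-shape algorithm proposed in the paper; the disk structuring element, the uniform sample spacing and all names below are assumptions made for the example.

```python
from math import sqrt

def disk_offsets(radius, spacing):
    """Height of a disk of the given radius at each sample offset within reach."""
    k = int(radius / spacing)
    return {j: sqrt(max(0.0, radius ** 2 - (j * spacing) ** 2))
            for j in range(-k, k + 1)}

def dilate(z, offsets):
    n = len(z)
    return [max(z[i + j] + h for j, h in offsets.items() if 0 <= i + j < n)
            for i in range(n)]

def erode(z, offsets):
    n = len(z)
    return [min(z[i + j] - h for j, h in offsets.items() if 0 <= i + j < n)
            for i in range(n)]

def closing(z, radius, spacing=1.0):
    """Naive morphological closing: dilation followed by erosion."""
    offsets = disk_offsets(radius, spacing)
    return erode(dilate(z, offsets), offsets)

# A flat profile with one narrow valley that a disk of radius 2 cannot enter.
profile = [0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0]
closed = closing(profile, radius=2.0)
print([round(v, 3) for v in closed])
```

Each output sample scans a window of the profile, so the cost grows with the product of the number of samples and the window size, which is the overhead a faster method aims to avoid.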
Overlapping community detection based on link graph using distance dynamics

Science.gov (United States)

Chen, Lei; Zhang, Jing; Cai, Li-Jun

2018-01-01

The distance dynamics model was recently proposed to detect the disjoint community of a complex network. To identify the overlapping structure of a network using the distance dynamics model, an overlapping community detection algorithm, called L-Attractor, is proposed in this paper. The process of L-Attractor mainly consists of three phases. In the first phase, L-Attractor transforms the original graph to a link graph (a new edge graph) to assure that one node has multiple distances. In the second phase, using the improved distance dynamics model, a dynamic interaction process is introduced to simulate the distance dynamics (shrink or stretch). Through the dynamic interaction process, all distances converge, and the disjoint community structure of the link graph naturally manifests itself. In the third phase, a recovery method is designed to convert the disjoint community structure of the link graph to the overlapping community structure of the original graph. Extensive experiments are conducted on the LFR benchmark networks as well as real-world networks. Based on the results, our algorithm demonstrates higher accuracy and quality than other state-of-the-art algorithms.
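The first and third phases can be illustrated with a small sketch: building the link graph, whose nodes are the edges of the original graph, and mapping link communities back to overlapping node communities. The distance dynamics of the second phase are not modelled, and the recovery rule used here (a node belongs to every community that contains one of its incident links) is an assumption, since the abstract does not spell out the recovery method.

```python
from itertools import combinations

def link_graph(edges):
    """Adjacency of the link graph: two links are adjacent if they share an endpoint."""
    links = [tuple(sorted(e)) for e in edges]
    adj = {link: set() for link in links}
    for e, f in combinations(links, 2):
        if set(e) & set(f):
            adj[e].add(f)
            adj[f].add(e)
    return adj

def node_memberships(link_community):
    """Map each node to the set of communities of its incident links (assumed rule)."""
    membership = {}
    for (u, v), community in link_community.items():
        membership.setdefault(u, set()).add(community)
        membership.setdefault(v, set()).add(community)
    return membership

edges = [(1, 2), (2, 3), (3, 4)]
lg = link_graph(edges)

# Hypothetical disjoint partition of the links, as phase two might produce.
memberships = node_memberships({(1, 2): 0, (2, 3): 0, (3, 4): 1})
print(memberships)
```

Node 3 ends up in both communities, which shows how a disjoint partition of links yields overlapping communities of nodes.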
A hybrid genetic algorithm and linear regression for prediction of NOx emission in power generation plant

International Nuclear Information System (INIS)

Bunyamin, Muhammad Afif; Yap, Keem Siah; Aziz, Nur Liyana Afiqah Abdul; Tiong, Sheih Kiong; Wong, Shen Yuong; Kamal, Md Fauzan

2013-01-01

This paper presents a new approach to gas emission estimation in power generation plants using a hybrid Genetic Algorithm (GA) and Linear Regression (LR) (denoted as GA-LR). LR is one of the approaches that model the relationship between an output dependent variable, y, and one or more explanatory variables or inputs, denoted as x. It is able to estimate unknown model parameters from input data. GA, on the other hand, searches for the optimal solution until a specified termination criterion is met. It can provide a set of good solutions, rather than a single optimal solution, for complex problems. Thus, GA is widely used for feature selection. By combining LR and GA (GA-LR), this new technique is able to select the most important input features as well as give more accurate predictions by minimizing the prediction errors. This new technique is able to produce more consistent gas emission estimates, which may help in reducing pollution of the environment. In this paper, the study focuses on the prediction of nitrogen oxides (NOx). The results of the experiment are encouraging.
Study on improved Ip-iq APF control algorithm and its application in micro grid

Science.gov (United States)

Xie, Xifeng; Shi, Hua; Deng, Haiyingv

2018-01-01

In order to enhance the tracking speed and accuracy of harmonic detection by the ip-iq algorithm, a novel ip-iq control algorithm based on instantaneous reactive power theory is presented. The improved algorithm adds a lead correction link to adjust the zero point of the detection system, and fuzzy self-tuning adaptive PI control is introduced to dynamically adjust the DC-link voltage, which meets the requirements of harmonic compensation in the micro grid. Simulation and experimental results verify that the proposed method is feasible and effective in the micro grid.
Performance of Machine Learning Algorithms for Qualitative and Quantitative Prediction Drug Blockade of hERG1 channel.

Science.gov (United States)

Wacker, Soren; Noskov, Sergei Yu

2018-05-01

Drug-induced abnormal heart rhythm known as Torsades de Pointes (TdP) is a potentially lethal ventricular tachycardia found in many patients. Even newly released anti-arrhythmic drugs, like ivabradine with the HCN channel as a primary target, block the hERG potassium current in an overlapping concentration interval. Promiscuous drug block of the hERG channel may potentially lead to perturbation of the action potential duration (APD) and TdP, especially when combined with polypharmacy and/or electrolyte disturbances. The example of the novel anti-arrhythmic ivabradine illustrates a clinically important and ongoing deficit in drug design and warrants better screening methods. There is an urgent need to develop new approaches for rapid and accurate assessment of how drugs with complex interactions and multiple subcellular targets can predispose to or protect from drug-induced TdP. One of the unexpected outcomes of the compulsory hERG screening implemented in the USA and the European Union is the availability of large datasets of IC50 values for various molecules entering the market. The abundant data now allow the construction of predictive machine-learning (ML) models. Novel ML algorithms and techniques promise accuracy in determining IC50 values of hERG blockade that is comparable to or surpasses that of earlier QSAR or molecular modeling techniques. To test the performance of modern ML techniques, we have developed a computational platform integrating various workflows for quantitative structure activity relationship (QSAR) models using data from the ChEMBL database. To establish the predictive power of ML-based algorithms, we computed IC50 values for a large dataset of molecules and compared them to data from an automated patch clamp system, an industry gold standard in studies of cardiotoxicity, for a large dataset of hERG blocking and non-blocking drugs. The optimal protocol with high sensitivity and predictive power is based on the novel eXtreme gradient boosting (XGBoost) algorithm. The ML-platform with XGBoost
Identifying paediatric nursing-sensitive outcomes in linked administrative health data.

Science.gov (United States)

Wilson, Sally; Bremner, Alexandra P; Hauck, Yvonne; Finn, Judith

2012-07-20

There is increasing interest in the contribution of the quality of nursing care to patient outcomes. Due to different casemix and risk profiles, algorithms for administrative health data that identify nursing-sensitive outcomes in adult hospitalised patients may not be applicable to paediatric patients. The study purpose was to test adult algorithms in a paediatric hospital population and make amendments to increase the accuracy of identification of hospital acquired events. The study also aimed to determine whether the use of linked hospital records improved the likelihood of correctly identifying patient outcomes as nursing sensitive rather than being related to their pre-morbid conditions. Using algorithms developed by Needleman et al. (2001), proportions and rates of records that identified nursing-sensitive outcomes for pressure ulcers, pneumonia and surgical wound infections were determined from administrative hospitalisation data for all paediatric patients discharged from a tertiary paediatric hospital in Western Australia between July 1999 and June 2009. The effects of changes to inclusion and exclusion criteria for each algorithm on the calculated proportion or rate in the paediatric population were explored. Linked records were used to identify comorbid conditions that increased nursing-sensitive outcome risk. Rates were calculated using algorithms revised for paediatric patients. Linked records of 129,719 hospital separations for 79,016 children were analysed. Identification of comorbid conditions was enhanced through access to prior and/or subsequent hospitalisation records (43% of children with pressure ulcers had a form of paralysis recorded only on a previous admission). Readmissions with a surgical wound infection were identified for 103 (4.8/1,000) surgical separations using linked data. After amendment of each algorithm for paediatric patients, rates of pressure ulcers and pneumonia reduced by 53% and 15% (from 1.3 to 0.6 and from 9.1 to 7.7 per
An improved multi-domain convolution tracking algorithm

Science.gov (United States)

Sun, Xin; Wang, Haiying; Zeng, Yingsen

2018-04-01

With the wide application of deep learning in the field of computer vision, deep learning has become a mainstream direction in object tracking. The tracking algorithm in this paper is based on an improved multi-domain convolutional neural network, and the network is pre-trained on the VOT video set using a multi-domain training strategy. During online tracking, the network evaluates candidate targets sampled from a Gaussian distribution around the predicted target in the previous frame, and the candidate with the highest score is taken as the predicted target of the current frame. A bounding box regression model is introduced to bring the predicted target closer to the ground-truth target box of the test set. A grouping-update strategy is used to extract and select useful update samples in each frame, which effectively prevents overfitting and adapts to changes in both target and environment. To improve the speed of the algorithm while maintaining performance, the number of candidate targets is adjusted dynamically by a self-adaptive parameter strategy. Finally, the algorithm is tested on the OTB set and compared with other high-performance tracking algorithms, and the success-rate and accuracy plots are drawn, which illustrate the outstanding performance of the tracking algorithm in this paper.
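The online step described above, sampling candidates from a Gaussian around the previous target and keeping the highest-scoring one, might look roughly like the sketch below. The network's scoring function is replaced by a toy distance-based scorer, and the box format (centre x, centre y, width, height), the sample count, the standard deviations and the 1.05 scale base are all assumptions for illustration, not values from the paper.

```python
import random

def sample_candidates(prev_box, n, sigma_pos, sigma_scale, rng):
    """Draw n candidate boxes (cx, cy, w, h) from a Gaussian around prev_box."""
    cx, cy, w, h = prev_box
    candidates = []
    for _ in range(n):
        scale = 1.05 ** rng.gauss(0.0, sigma_scale)  # assumed scale perturbation
        candidates.append((rng.gauss(cx, sigma_pos),
                           rng.gauss(cy, sigma_pos),
                           w * scale, h * scale))
    return candidates

def track_step(prev_box, score_fn, n=256, sigma_pos=5.0, sigma_scale=0.5, seed=0):
    """Return the candidate with the highest score for the current frame."""
    rng = random.Random(seed)
    candidates = sample_candidates(prev_box, n, sigma_pos, sigma_scale, rng)
    return max(candidates, key=score_fn)

# Toy scorer standing in for the network: closer to a hidden target centre is better.
target_centre = (53.0, 52.0)

def toy_score(box):
    return -((box[0] - target_centre[0]) ** 2 + (box[1] - target_centre[1]) ** 2)

prev_box = (50.0, 50.0, 20.0, 40.0)
best = track_step(prev_box, toy_score)
print(best)
```

In a real tracker the number of samples `n` is the quantity a self-adaptive strategy would vary from frame to frame to trade speed against robustness.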
Control model design to limit DC-link voltage during grid fault in a DFIG variable speed wind turbine

Science.gov (United States)

Nwosu, Cajethan M.; Ogbuka, Cosmas U.; Oti, Stephen E.

2017-08-01

This paper presents a control model design capable of inhibiting the phenomenal rise in the DC-link voltage during grid-fault conditions in a variable speed wind turbine. In contrast to power circuit protection strategies, which have inherent limitations in fault ride-through capability, a control circuit algorithm is proposed here that limits the DC-link voltage rise, whose dynamics in turn directly influence the characteristics of the rotor voltage, especially during grid faults. The model results compare favorably with simulation results obtained in a MATLAB/SIMULINK environment. The generated model may therefore be used to predict, with near accuracy, the nature of DC-link voltage variations during a fault, given factors that include the speed and speed mode of operation and the value of the damping resistor relative to half the product of the inner-loop current control bandwidth and the filter inductance.
Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation Networks.

Science.gov (United States)

Yu, Haiyang; Wu, Zhihai; Wang, Shuqin; Wang, Yunpeng; Ma, Xiaolei

2017-06-26

Predicting large-scale transportation network traffic has become an important and challenging topic in recent decades. Inspired by the domain knowledge of motion prediction, in which the future motion of an object can be predicted based on previous scenes, we propose a network grid representation method that can retain the fine-scale structure of a transportation network. Network-wide traffic speeds are converted into a series of static images and input into a novel deep architecture, namely, spatiotemporal recurrent convolutional networks (SRCNs), for traffic forecasting. The proposed SRCNs inherit the advantages of deep convolutional neural networks (DCNNs) and long short-term memory (LSTM) neural networks. The spatial dependencies of network-wide traffic can be captured by DCNNs, and the temporal dynamics can be learned by LSTMs. An experiment on a Beijing transportation network with 278 links demonstrates that SRCNs outperform other deep learning-based algorithms in both short-term and long-term traffic prediction.
Can human experts predict solubility better than computers?

Science.gov (United States)

Boobier, Samuel; Osbourn, Anne; Mitchell, John B O

2017-12-13

In this study, we design and carry out a survey, asking human experts to predict the aqueous solubility of druglike organic compounds. We investigate whether these experts, drawn largely from the pharmaceutical industry and academia, can match or exceed the predictive power of algorithms. Alongside this, we implement 10 typical machine learning algorithms on the same dataset. The best algorithm, a variety of neural network known as a multi-layer perceptron, gave an RMSE of 0.985 log S units and an R² of 0.706. We would not have predicted the relative success of this particular algorithm in advance. We found that the best individual human predictor generated an almost identical prediction quality with an RMSE of 0.942 log S units and an R² of 0.723. The collection of algorithms contained a higher proportion of reasonably good predictors, nine out of ten compared with around half of the humans. We found that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median generated excellent predictivity. While our consensus human predictor achieved very slightly better headline figures on various statistical measures, the difference between it and the consensus machine learning predictor was both small and statistically insignificant. We conclude that human experts can predict the aqueous solubility of druglike molecules essentially as well as machine learning algorithms. We find that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median is a powerful way of benefitting from the wisdom of crowds.
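The median consensus idea is simple enough to show directly. In the sketch below the numbers are invented toy values, not data from the survey, and the helper names are illustrative; it only demonstrates how a per-compound median is formed and scored with RMSE.

```python
from math import sqrt
from statistics import median

def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

def consensus(predictions):
    """Per-compound median over a list of per-predictor prediction lists."""
    return [median(column) for column in zip(*predictions)]

# Toy log S values for three compounds (made up for illustration).
observed = [-2.0, -3.5, -1.0]
predictions = [
    [-2.5, -3.0, -0.5],  # predictor A
    [-1.0, -3.6, -1.2],  # predictor B
    [-2.1, -5.0, -1.0],  # predictor C
]

combined = consensus(predictions)
print(combined, round(rmse(combined, observed), 3))
for row in predictions:
    print(round(rmse(row, observed), 3))
```

In this toy case the median suppresses each predictor's single large miss, so the consensus RMSE is lower than that of any individual predictor; this will not hold for every dataset.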
Applying Probability Theory for the Quality Assessment of a Wildfire Spread Prediction Framework Based on Genetic Algorithms

Directory of Open Access Journals (Sweden)

Andrés Cencerrado

2013-01-01

Full Text Available This work presents a framework for assessing how the existing constraints at the time of attending an ongoing forest fire affect simulation results, both in terms of the quality (accuracy) obtained and the time needed to make a decision. In the wildfire spread simulation and prediction area, it is essential to properly exploit the computational power offered by new computing advances. For this purpose, we rely on a two-stage prediction process to enhance the quality of traditional predictions, taking advantage of parallel computing. This strategy is based on an adjustment stage which is carried out by a well-known evolutionary technique: Genetic Algorithms. The core of this framework is evaluated according to the principles of probability theory. Thus, a strong statistical study is presented and oriented towards the characterization of such an adjustment technique in order to help the operation managers deal with the two aspects previously mentioned: time and quality. The experimental work in this paper is based on a region in Spain which is one of the most prone to forest fires: El Cap de Creus.
A Novel Approach for Blast-Induced Flyrock Prediction Based on Imperialist Competitive Algorithm and Artificial Neural Network

Science.gov (United States)

Marto, Aminaton; Jahed Armaghani, Danial; Tonnizam Mohamad, Edy; Makhtar, Ahmad Mahir

2014-01-01

Flyrock is one of the major disturbances induced by blasting which may cause severe damage to nearby structures. This phenomenon has to be precisely predicted and subsequently controlled through changes in the blast design to minimize the potential risk of blasting. The scope of this study is to predict flyrock induced by blasting through a novel approach based on the combination of imperialist competitive algorithm (ICA) and artificial neural network (ANN). For this purpose, the parameters of 113 blasting operations were accurately recorded and flyrock distances were measured for each operation. By applying sensitivity analysis, maximum charge per delay and powder factor were determined as the most influential parameters on flyrock. In the light of this analysis, two new empirical predictors were developed to predict flyrock distance. For comparison purposes, a predeveloped backpropagation (BP) ANN was developed and the results were compared with those of the proposed ICA-ANN model and empirical predictors. The results clearly showed the superiority of the proposed ICA-ANN model in comparison with the proposed BP-ANN model and empirical approaches. PMID:25147856
Customer demand prediction of service-oriented manufacturing using the least square support vector machine optimized by particle swarm optimization algorithm

Science.gov (United States)

Cao, Jin; Jiang, Zhibin; Wang, Kangzhou

2017-07-01

Many nonlinear customer satisfaction-related factors significantly influence the future customer demand for service-oriented manufacturing (SOM). To address this issue and enhance the prediction accuracy, this article develops a novel customer demand prediction approach for SOM. The approach combines the phase space reconstruction (PSR) technique with the optimized least square support vector machine (LSSVM). First, the prediction sample space is reconstructed by the PSR to enrich the time-series dynamics of the limited data sample. Then, the generalization and learning ability of the LSSVM are improved by the hybrid polynomial and radial basis function kernel. Finally, the key parameters of the LSSVM are optimized by the particle swarm optimization algorithm. In a real case study, the customer demand prediction of an air conditioner compressor is implemented. Furthermore, the effectiveness and validity of the proposed approach are demonstrated by comparison with other classical prediction approaches.
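Phase space reconstruction is commonly done by time-delay embedding, which turns a scalar series into overlapping delay vectors that can serve as regression inputs. The sketch below assumes that standard construction; the embedding dimension `dim`, the delay `tau`, the one-step-ahead target and the demand figures are illustrative choices, not values from the article.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]."""
    n = len(series) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)] for t in range(n)]

def one_step_samples(series, dim, tau):
    """Pair each delay vector with the value that follows its last component."""
    vectors = delay_embed(series, dim, tau)
    last = (dim - 1) * tau
    inputs = vectors[:-1]          # the final vector has no successor to predict
    targets = series[last + 1:]
    return inputs, targets

demand = [1, 2, 3, 4, 5, 6]        # hypothetical demand series
vectors = delay_embed(demand, dim=3, tau=2)
inputs, targets = one_step_samples(demand, dim=3, tau=2)
print(vectors, inputs, targets)
```

A short series therefore yields several training samples, which is the sense in which the reconstruction enriches a limited data sample before the regression model is fitted.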
Predictability and Prediction for an Experimental Cultural Market

Science.gov (United States)

Colbaugh, Richard; Glass, Kristin; Ormerod, Paul

Individuals are often influenced by the behavior of others, for instance because they wish to obtain the benefits of coordinated actions or infer otherwise inaccessible information. In such situations this social influence decreases the ex ante predictability of the ensuing social dynamics. We claim that, interestingly, these same social forces can increase the extent to which the outcome of a social process can be predicted very early in the process. This paper explores this claim through a theoretical and empirical analysis of the experimental music market described and analyzed in [1]. We propose a very simple model for this music market, assess the predictability of market outcomes through formal analysis of the model, and use insights derived through this analysis to develop algorithms for predicting market share winners, and their ultimate market shares, in the very early stages of the market. The utility of these predictive algorithms is illustrated through analysis of the experimental music market data sets [2].
The "weakest link" as an indicator of cognitive vulnerability differentially predicts symptom dimensions of anxiety in adolescents in China.

Science.gov (United States)

Wang, Junyi; Wang, Danyang; Cui, Lixia; McWhinnie, Chad M; Wang, Li; Xiao, Jing

2017-08-01

This multiwave longitudinal study examined the cognitive vulnerability-stress component of hopelessness theory to differentially predict symptom dimensions of anxiety using a "weakest link" approach in a sample of adolescents from Hunan Province, China. Baseline and 6-month follow-up data were obtained from 553 middle-school students. During an initial assessment, participants completed measures assessing their weakest links, anxious symptoms, and the occurrence of stress. Participants subsequently completed measures assessing stress and anxious symptoms once a month for six months. Higher weakest link scores were associated with greater increases in the harm avoidance and separation anxiety/panic dimensions, but not the physical or social anxiety dimension, of anxious symptoms following stress in Chinese adolescents. These results support the applicability of the "weakest link" approach, derived from hopelessness theory, in Chinese adolescents. Weakest link scores as cognitive vulnerability factors may play a role in the development of anxious symptoms, especially in the cognitive dimensions (e.g., harm avoidance and separation anxiety/panic). Our findings also have potential value in explaining the effectiveness of cognitive relevant therapy in treating the cognitive dimensions of anxious symptoms. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Deep Learning Algorithm for Prediction of Age-Related Eye Disease Study Severity Scale for Age-Related Macular Degeneration from Color Fundus Photography.

Science.gov (United States)

Grassmann, Felix; Mengelkamp, Judith; Brandl, Caroline; Harsch, Sebastian; Zimmermann, Martina E; Linkohr, Birgit; Peters, Annette; Heid, Iris M; Palm, Christoph; Weber, Bernhard H F

2018-04-10

Age-related macular degeneration (AMD) is a common threat to vision. While classification of disease stages is critical to understanding disease risk and progression, several systems based on color fundus photographs are known. Most of these require in-depth and time-consuming analysis of fundus images. Herein, we present an automated computer-based classification algorithm. Algorithm development for AMD classification based on a large collection of color fundus images. Validation is performed on a cross-sectional, population-based study. We included 120 656 manually graded color fundus images from 3654 Age-Related Eye Disease Study (AREDS) participants. AREDS participants were >55 years of age, and non-AMD sight-threatening diseases were excluded at recruitment. In addition, performance of our algorithm was evaluated in 5555 fundus images from the population-based Kooperative Gesundheitsforschung in der Region Augsburg (KORA; Cooperative Health Research in the Region of Augsburg) study. We defined 13 classes (9 AREDS steps, 3 late AMD stages, and 1 for ungradable images) and trained several convolution deep learning architectures. An ensemble of network architectures improved prediction accuracy. An independent dataset was used to evaluate the performance of our algorithm in a population-based study. κ Statistics and accuracy to evaluate the concordance between predicted and expert human grader classification. A network ensemble of 6 different neural net architectures predicted the 13 classes in the AREDS test set with a quadratic weighted κ of 92% (95% confidence interval, 89%-92%) and an overall accuracy of 63.3%. In the independent KORA dataset, images wrongly classified as AMD were mainly the result of a macular reflex observed in young individuals. By restricting the KORA analysis to individuals >55 years of age and prior exclusion of other retinopathies, the weighted and unweighted κ increased to 50% and 63%, respectively. Importantly, the algorithm
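The quadratic weighted κ reported in this abstract measures agreement between ordinal gradings while penalising distant disagreements more than near misses. The sketch below shows the standard definition in plain Python; the function name and the toy labels are hypothetical and are not taken from the study.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..n_classes-1.

    kappa = 1 - sum(w * observed) / sum(w * expected),
    with w[i][j] = (i - j)^2 / (n_classes - 1)^2.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_classes < 2:
        raise ValueError("need at least two classes")
    n = len(rater_a)
    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independent raters with the same marginals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single class")
    return 1.0 - weighted_observed / weighted_expected


# Toy gradings on a 4-step scale (illustrative values only).
expert = [0, 1, 2, 3, 3, 1]
model = [0, 1, 2, 3, 2, 1]
kappa = quadratic_weighted_kappa(expert, model, n_classes=4)
```

Because the weights grow with the squared distance between classes, a model can score a high weighted κ while its exact-match accuracy stays moderate, which is consistent with the pairing of a high κ and a 63.3% accuracy in the abstract.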
A General Mathematical Algorithm for Predicting the Course of Unfused Tetanic Contractions of Motor Units in Rat Muscle.

Directory of Open Access Journals (Sweden)

Rositsa Raikova

Full Text Available An unfused tetanus of a motor unit (MU) evoked by a train of pulses at variable interpulse intervals is the sum of non-equal twitch-like responses to these stimuli. A tool for a precise prediction of these successive contractions for MUs of different physiological types with different contractile properties is crucial for modeling the whole muscle behavior during various types of activity. The aim of this paper is to develop such a general mathematical algorithm for the MUs of the medial gastrocnemius muscle of rats. For this purpose, tetanic curves recorded for 30 MUs (10 slow, 10 fast fatigue-resistant and 10 fast fatigable) were mathematically decomposed into twitch-like contractions. Each contraction was modeled by the previously proposed 6-parameter analytical function, and the analysis of these six parameters allowed us to develop a prediction algorithm based on the following input data: parameters of the initial twitch, the maximum force of a MU and the series of pulses. A linear relationship was found between the normalized amplitudes of the successive contractions and the remainder between the actual force levels at which the contraction started and the maximum tetanic force. The normalization was made according to the amplitude of the first decomposed twitch. However, the respective approximation lines had different specific angles with respect to the ordinate. These angles had different and non-overlapping ranges for slow and fast MUs. A sensitivity analysis concerning this slope was performed and the dependence between the angles and the maximal fused tetanic force normalized to the amplitude of the first contraction was approximated by a power function. The normalized MU contraction and half-relaxation times were approximated by linear functions depending on the normalized actual force levels at which each contraction starts. The normalization was made according to the contraction time of the first contraction. The actual force levels
A hybrid Genetic Algorithm and Monte Carlo simulation approach to predict hourly energy consumption and generation by a cluster of Net Zero Energy Buildings

International Nuclear Information System (INIS)

Garshasbi, Samira; Kurnitski, Jarek; Mohammadi, Yousef

2016-01-01

Graphical abstract: The energy consumption and renewable generation in a cluster of NZEBs are modeled by a novel hybrid Genetic Algorithm and Monte Carlo simulation approach and used for the prediction of instantaneous and cumulative net energy balances and hourly amount of energy taken from and supplied to the central energy grid. - Highlights: • Hourly energy consumption and generation by a cluster of NZEBs was simulated. • Genetic Algorithm and Monte Carlo simulation approach were employed. • Dampening effect of energy used by a cluster of buildings was demonstrated. • Hourly amount of energy taken from and supplied to the grid was simulated. • Results showed that NZEB cluster was 63.5% grid dependent on an annual basis. - Abstract: Employing a hybrid Genetic Algorithm (GA) and Monte Carlo (MC) simulation approach, energy consumption and renewable energy generation in a cluster of Net Zero Energy Buildings (NZEBs) were thoroughly investigated with hourly simulation. Moreover, the cumulative energy consumption and generation of the whole cluster and each individual building within the simulation space were accurately monitored and reported. The results indicate that the developed simulation algorithm is able to predict the total instantaneous and cumulative amount of energy taken from and supplied to the central energy grid over any time period. During the course of simulation, about 60–100% of total daily generated renewable energy was consumed by NZEBs and up to 40% of that was fed back into the central energy grid as surplus energy. The minimum grid dependency of the cluster was observed in June and July, when 11.2% and 9.9% of the required electricity was supplied from the central energy grid, respectively. On the other hand, the NZEB cluster was strongly grid dependent in January and December, importing 70.7% and 76.1% of its required energy demand via the central energy grid, respectively. Simulation results revealed that the cluster was 63
Linking spring phenology with mechanistic models of host movement to predict disease transmission risk

Science.gov (United States)

Merkle, Jerod A.; Cross, Paul C.; Scurlock, Brandon M.; Cole, Eric K.; Courtemanch, Alyson B.; Dewey, Sarah R.; Kauffman, Matthew J.

2018-01-01

Disease models typically focus on temporal dynamics of infection, while often neglecting environmental processes that determine host movement. In many systems, however, temporal disease dynamics may be slow compared to the scale at which environmental conditions alter host space-use and accelerate disease transmission. Using a mechanistic movement modelling approach, we made space-use predictions of a mobile host (elk [Cervus canadensis] carrying the bacterial disease brucellosis) under environmental conditions that change daily and annually (e.g., plant phenology, snow depth), and we used these predictions to infer how spring phenology influences the risk of brucellosis transmission from elk (through aborted foetuses) to livestock in the Greater Yellowstone Ecosystem. Using data from 288 female elk monitored with GPS collars, we fit step selection functions (SSFs) during the spring abortion season and then implemented a master equation approach to translate SSFs into predictions of daily elk distribution for five plausible winter weather scenarios (from a heavy snow year to an extreme winter drought year). We predicted abortion events by combining elk distributions with empirical estimates of daily abortion rates, spatially varying elk seroprevalence and elk population counts. Our results reveal strong spatial variation in disease transmission risk at daily and annual scales that is strongly governed by variation in host movement in response to spring phenology. For example, in comparison with an average snow year, years with early snowmelt are predicted to have 64% of the abortions occurring on feedgrounds shift to occurring on mainly public lands, and to a lesser extent on private lands. Synthesis and applications. Linking mechanistic models of host movement with disease dynamics leads to a novel bridge between movement and disease ecology. Our analysis framework offers new avenues for predicting disease spread, while providing managers tools to proactively mitigate
Seismology for rockburst prediction.

CSIR Research Space (South Africa)

De Beer, W

2000-02-01

Full Text Available project GAP409 presents a method (SOOTHSAY) for predicting larger mining induced seismic events in gold mines, as well as a pattern recognition algorithm (INDICATOR) for characterising the seismic response of rock to mining and inferring future... State. Defining the time series of a specific function on a catalogue as a prediction strategy, the algorithm currently has a success rate of 53% and 65%, respectively, of large events claimed as being predicted in these two cases, with uncertainties...
Testing block subdivision algorithms on block designs

Science.gov (United States)

Wiseman, Natalie; Patterson, Zachary

2016-01-01

Integrated land use-transportation models predict future transportation demand taking into account how households and firms arrange themselves partly as a function of the transportation system. Recent integrated models require parcels as inputs and produce household and employment predictions at the parcel scale. Block subdivision algorithms automatically generate parcel patterns within blocks. Evaluating block subdivision algorithms is done by way of generating parcels and comparing them to those in a parcel database. Three block subdivision algorithms are evaluated on how closely they reproduce parcels of different block types found in a parcel database from Montreal, Canada. While the authors who developed each of the algorithms have evaluated them, they have used their own metrics and block types to evaluate their own algorithms. This makes it difficult to compare their strengths and weaknesses. The contribution of this paper is in resolving this difficulty with the aim of finding a better algorithm suited to subdividing each block type. The proposed hypothesis is that given the different approaches that block subdivision algorithms take, it's likely that different algorithms are better adapted to subdividing different block types. To test this, a standardized block type classification is used that consists of mutually exclusive and comprehensive categories. A statistical method is used for finding a better algorithm and the probability it will perform well for a given block type. Results suggest the oriented bounding box algorithm performs better for warped non-uniform sites, as well as gridiron and fragmented uniform sites. It also produces more similar parcel areas and widths. The Generalized Parcel Divider 1 algorithm performs better for gridiron non-uniform sites. The Straight Skeleton algorithm performs better for loop and lollipop networks as well as fragmented non-uniform and warped uniform sites. It also produces more similar parcel shapes and patterns.
Implementation of Freeman-Wimley prediction algorithm in a web-based application for in silico identification of beta-barrel membrane proteins

Directory of Open Access Journals (Sweden)

José Antonio Agüero-Fernández

2015-11-01

Full Text Available Beta-barrel type proteins play an important role in both human and veterinary medicine. In particular, their localization on the bacterial surface, and their involvement in virulence mechanisms of pathogens, have turned them into an interesting target in studies to search for vaccine candidates. Recently, Freeman and Wimley developed a prediction algorithm based on the physicochemical properties of transmembrane beta-barrel proteins (TMBBs). Based on that algorithm, and using Grails, a web-based application was implemented. This system, named Beta Predictor, is capable of processing from one protein sequence to complete predicted proteomes of up to 10000 proteins with a runtime of about 0.019 seconds per 500-residue protein, and it allows graphical analyses for each protein. The application was evaluated with a validation set of 535 non-redundant proteins, 102 TMBBs and 433 non-TMBBs. The sensitivity, specificity, Matthews correlation coefficient, positive predictive value and accuracy were calculated, being 85.29%, 95.15%, 78.72%, 80.56% and 93.27%, respectively. The performance of this system was compared with the TMBB predictors BOMP and TMBHunt, using the same validation set. In the order mentioned above, the following results were obtained: 76.47%, 99.31%, 83.05%, 96.30% and 94.95% for BOMP, and 78.43%, 92.38%, 67.90%, 70.17% and 89.78% for TMBHunt. Beta Predictor was outperformed by BOMP but the latter showed better behavior than TMBHunt.
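The five evaluation measures listed for Beta Predictor all follow from a single 2x2 confusion matrix. The sketch below computes them in plain Python. The counts used in the example (87 true positives, 15 false negatives, 412 true negatives, 21 false positives) are an assumption: they are back-calculated from the reported percentages and the 102/433 split, and are not stated in the abstract.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall on true TMBBs
    specificity = tn / (tn + fp)            # recall on non-TMBBs
    ppv = tp / (tp + fp)                    # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "ppv": ppv,
        "accuracy": accuracy,
    }


# Counts inferred from the reported percentages (assumption, see lead-in).
beta_predictor = confusion_metrics(tp=87, fn=15, tn=412, fp=21)
for name, value in beta_predictor.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these inferred counts the five values agree with the figures quoted in the abstract to within rounding, which suggests the reported percentages are internally consistent.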
Identify Beta-Hairpin Motifs with Quadratic Discriminant Algorithm Based on the Chemical Shifts.

Directory of Open Access Journals (Sweden)

Feng YongE

Full Text Available Successful prediction of the beta-hairpin motif will be helpful for understanding fold recognition. Some algorithms have been proposed for the prediction of beta-hairpin motifs. However, the parameters used by these methods were primarily based on the amino acid sequences. Here, we proposed a novel model for predicting beta-hairpin structure based on the chemical shift. Firstly, we analyzed the statistical distribution of chemical shifts of six nuclei in non-beta-hairpin and beta-hairpin motifs. Secondly, we used these chemical shifts as features combined with three algorithms to predict beta-hairpin structure. Finally, we achieved the best prediction, namely a sensitivity of 92% and a specificity of 94% with a Matthews correlation coefficient of 0.85, using the quadratic discriminant analysis algorithm, which is clearly superior to the same method for the prediction of beta-hairpin structure from 20 amino acid compositions in three-fold cross-validation. Our finding showed that the chemical shift is an effective parameter for beta-hairpin prediction, suggesting that quadratic discriminant analysis is a powerful algorithm for the prediction of beta-hairpins.
An Improved Algorithm for Predicting Free Recalls

Science.gov (United States)

Laming, Donald

2008-01-01

Laming [Laming, D. (2006). "Predicting free recalls." "Journal of Experimental Psychology: Learning, Memory, and Cognition," 32, 1146-1163] has shown that, in a free-recall experiment in which the participants rehearsed out loud, entire sequences of recalls could be predicted, to a useful degree of precision, from the prior sequences of stimuli…
Hyperparameter Optimization of Artificial Neural Network in Customer Churn Prediction using Genetic Algorithm

Directory of Open Access Journals (Sweden)

Martin Fridrich

2017-06-01

Full Text Available Purpose of the article: The ability of a company to predict customer churn and retain customers is considered to be a worthy competitive advantage since it improves cost allocation in customer retention programs, retaining future revenue and profits. In addition, it has several positive indirect impacts such as increasing customer loyalty. Therefore, the focus of the article is on building a highly reliable and robust classification model which deals with such a task. Methodology/methods: The analysis is carried out on a labelled e-commerce retail dataset describing the 10 000 most valuable customers with the highest CLV (Customer Lifetime Value). To obtain the best performing ANN (Artificial Neural Network) classification model, the proposed hyperparameter search space is explored with a genetic algorithm to find suitable parameter settings. ANN classification performance is measured with regard to prediction ability, which is understood as a point estimate of the mean AUC (Area Under Curve) on a 4-fold cross-validation set. The explored part of the hyperparameter search space is analyzed with a conditional inference tree structure addressing the underlying fundamental context of the given optimization, which results in identification of critical factors leading to a well performing ANN classification model. Scientific aim: To present and execute an experimental design for performance evaluation and hyperparameter optimization of classification models which are used for customer churn prediction. Findings: It is concluded and statistically proven that, in the experimental context described, the regularization parameter as well as the training function have a significant influence on classifier AUC performance, in contrast to other properties of the ANN. More specifically, well performing ANN classification models have the regularization parameter set to 0, the adaptation function set to trainlm or trainscg and more than 100 training epochs. Global optimum is identified for solution with regularization parameter set to
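The fitness measure described in this abstract, the mean AUC over cross-validation folds, can be computed without any library support. The following is a minimal sketch under stated assumptions: the function names and the fold data are hypothetical, and the AUC is computed with the pairwise (Mann-Whitney) definition, which the abstract does not specify.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0),
    counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_cv_auc(folds):
    """Mean AUC over validation folds; `folds` is a list of (scores, labels)."""
    values = [auc(scores, labels) for scores, labels in folds]
    return sum(values) / len(values)


# Hypothetical churn scores on four validation folds (1 = churned).
folds = [
    ([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]),
    ([0.7, 0.4, 0.6, 0.2], [1, 0, 1, 0]),
    ([0.5, 0.5, 0.9, 0.1], [1, 0, 1, 0]),
    ([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0]),
]
fitness = mean_cv_auc(folds)
```

In a genetic-algorithm search, a value such as `fitness` would be evaluated once per candidate hyperparameter setting; the pairwise loop is quadratic in fold size, so a rank-based formulation would be preferable for large folds.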
Simplification of neural network model for predicting local power distributions of BWR fuel bundle using learning algorithm with forgetting

International Nuclear Information System (INIS)

Tanabe, Akira; Yamamoto, Toru; Shinfuku, Kimihiro; Nakamae, Takuji; Nishide, Fusayo.

1995-01-01

Previously a two-layered neural network model was developed to predict the relation between fissile enrichment of each fuel rod and local power distribution in a BWR fuel bundle. This model was obtained intuitively based on 33 patterns of training signals after an intensive survey of the models. Recently, a learning algorithm with forgetting was reported to simplify neural network models. It is of interest to see what kind of model is obtained if this algorithm is applied to a complex three-layered model which learns the same training signals. A three-layered model, expanded to have direct connections between the 1st and the 3rd layer elements, has been constructed, and the learning method of normal back propagation was applied first to this model. The forgetting algorithm was then added to this learning process. The connections concerned with the 2nd layer elements disappeared and the 2nd layer became unnecessary. Learning the same training signals took an order of magnitude more computing time than simple back propagation, but the two-layered model was obtained autonomously from the expanded three-layered model. (author)
Development of hybrid artificial intelligent based handover decision algorithm

Directory of Open Access Journals (Sweden)

A.M. Aibinu

2017-04-01

Full Text Available The possibility of seamless handover remains a mirage despite the plethora of existing handover algorithms. The underlying factor responsible for this has been traced to the handover decision module in the handover process. Hence, in this paper, a novel hybrid artificial intelligent handover decision algorithm has been developed. The developed model is a hybrid of an Artificial Neural Network (ANN) based prediction model and Fuzzy Logic. On accessing the network, the Received Signal Strength (RSS) was acquired over a period of time to form time series data. The data was then fed to the newly proposed k-step ahead ANN-based RSS prediction system for estimation of prediction model coefficients. The synaptic weights and adaptive coefficients of the trained ANN were then used to compute the k-step ahead ANN-based RSS prediction model coefficients. The predicted RSS value was later codified as fuzzy sets and, in conjunction with other measured network parameters, fed into the fuzzy logic controller in order to finalize the handover decision process. The performance of the newly developed k-step ahead ANN-based RSS prediction algorithm was evaluated using simulated and real data acquired from available mobile communication networks. Results obtained in both cases show that the proposed algorithm is capable of predicting the RSS value ahead to within about ±0.0002 dB. The cascaded effect of the complete handover decision module was also evaluated. Results obtained show that the newly proposed hybrid approach was able to reduce the ping-pong effect associated with other handover techniques.
