WorldWideScience

Sample records for link prediction algorithms

  1. The Algorithm of Link Prediction on Social Network

    Directory of Open Access Journals (Sweden)

    Liyan Dong

    2013-01-01

    Full Text Available At present, most link prediction algorithms are based on the similarity between two entities. Social network topology information is one of the main sources to design the similarity function between entities. But the existing link prediction algorithms do not apply the network topology information sufficiently. For lack of traditional link prediction algorithms, we propose two improved algorithms: CNGF algorithm based on local information and KatzGF algorithm based on global information network. For the defect of the stationary of social network, we also provide the link prediction algorithm based on nodes multiple attributes information. Finally, we verified these algorithms on DBLP data set, and the experimental results show that the performance of the improved algorithm is superior to that of the traditional link prediction algorithm.

  2. Named Entity Linking Algorithm

    Directory of Open Access Journals (Sweden)

    M. F. Panteleev

    2017-01-01

    Full Text Available In the tasks of processing text in natural language, Named Entity Linking (NEL represents the task to define and link some entity, which is found in the text, with some entity in the knowledge base (for example, Dbpedia. Currently, there is a diversity of approaches to solve this problem, but two main classes can be identified: graph-based approaches and machine learning-based ones. Graph and Machine Learning approaches-based algorithm is proposed accordingly to the stated assumptions about the interrelations of named entities in a sentence and in general.In the case of graph-based approaches, it is necessary to solve the problem of identifying an optimal set of the related entities according to some metric that characterizes the distance between these entities in a graph built on some knowledge base. Due to limitations in processing power, to solve this task directly is impossible. Therefore, its modification is proposed. Based on the algorithms of machine learning, an independent solution cannot be built due to small volumes of training datasets relevant to NEL task. However, their use can contribute to improving the quality of the algorithm. The adaptation of the Latent Dirichlet Allocation model is proposed in order to obtain a measure of the compatibility of attributes of various entities encountered in one context.The efficiency of the proposed algorithm was experimentally tested. A test dataset was independently generated. On its basis the performance of the model was compared using the proposed algorithm with the open source product DBpedia Spotlight, which solves the NEL problem.The mockup, based on the proposed algorithm, showed a low speed as compared to DBpedia Spotlight. However, the fact that it has shown higher accuracy, stipulates the prospects for work in this direction.The main directions of development were proposed in order to increase the accuracy of the system and its productivity.

  3. Link prediction with node clustering coefficient

    Science.gov (United States)

    Wu, Zhihao; Lin, Youfang; Wang, Jing; Gregory, Steve

    2016-06-01

    Predicting missing links in incomplete complex networks efficiently and accurately is still a challenging problem. The recently proposed Cannistrai-Alanis-Ravai (CAR) index shows the power of local link/triangle information in improving link-prediction accuracy. Inspired by the idea of employing local link/triangle information, we propose a new similarity index with more local structure information. In our method, local link/triangle structure information can be conveyed by clustering coefficient of common-neighbors directly. The reason why clustering coefficient has good effectiveness in estimating the contribution of a common-neighbor is that it employs links existing between neighbors of a common-neighbor and these links have the same structural position with the candidate link to this common-neighbor. In our experiments, three estimators: precision, AUP and AUC are used to evaluate the accuracy of link prediction algorithms. Experimental results on ten tested networks drawn from various fields show that our new index is more effective in predicting missing links than CAR index, especially for networks with low correlation between number of common-neighbors and number of links between common-neighbors.

  4. A novel time series link prediction method: Learning automata approach

    Science.gov (United States)

    Moradabadi, Behnaz; Meybodi, Mohammad Reza

    2017-09-01

    Link prediction is a main social network challenge that uses the network structure to predict future links. The common link prediction approaches to predict hidden links use a static graph representation where a snapshot of the network is analyzed to find hidden or future links. For example, similarity metric based link predictions are a common traditional approach that calculates the similarity metric for each non-connected link and sort the links based on their similarity metrics and label the links with higher similarity scores as the future links. Because people activities in social networks are dynamic and uncertainty, and the structure of the networks changes over time, using deterministic graphs for modeling and analysis of the social network may not be appropriate. In the time-series link prediction problem, the time series link occurrences are used to predict the future links In this paper, we propose a new time series link prediction based on learning automata. In the proposed algorithm for each link that must be predicted there is one learning automaton and each learning automaton tries to predict the existence or non-existence of the corresponding link. To predict the link occurrence in time T, there is a chain consists of stages 1 through T - 1 and the learning automaton passes from these stages to learn the existence or non-existence of the corresponding link. Our preliminary link prediction experiments with co-authorship and email networks have provided satisfactory results when time series link occurrences are considered.

  5. Algorithms for Protein Structure Prediction

    DEFF Research Database (Denmark)

    Paluszewski, Martin

    -trace. Here we present three different approaches for reconstruction of C-traces from predictable measures. In our first approach [63, 62], the C-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE......) is more robust than standard Monte Carlo search. In the second approach for reconstruction of C-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower...... bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm...

  6. Link prediction in multiplex online social networks

    Science.gov (United States)

    Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

    2017-02-01

    Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.

  7. Link Label Prediction in Signed Citation Network

    KAUST Repository

    Akujuobi, Uchenna

    2016-04-12

    Link label prediction is the problem of predicting the missing labels or signs of all the unlabeled edges in a network. For signed networks, these labels can either be positive or negative. In recent years, different algorithms have been proposed such as using regression, trust propagation and matrix factorization. These approaches have tried to solve the problem of link label prediction by using ideas from social theories, where most of them predict a single missing label given that labels of other edges are known. However, in most real-world social graphs, the number of labeled edges is usually less than that of unlabeled edges. Therefore, predicting a single edge label at a time would require multiple runs and is more computationally demanding. In this thesis, we look at link label prediction problem on a signed citation network with missing edge labels. Our citation network consists of papers from three major machine learning and data mining conferences together with their references, and edges showing the relationship between them. An edge in our network is labeled either positive (dataset relevant) if the reference is based on the dataset used in the paper or negative otherwise. We present three approaches to predict the missing labels. The first approach converts the label prediction problem into a standard classification problem. We then, generate a set of features for each edge and then adopt Support Vector Machines in solving the classification problem. For the second approach, we formalize the graph such that the edges are represented as nodes with links showing similarities between them. We then adopt a label propagation method to propagate the labels on known nodes to those with unknown labels. In the third approach, we adopt a PageRank approach where we rank the nodes according to the number of incoming positive and negative edges, after which we set a threshold. Based on the ranks, we can infer an edge would be positive if it goes a node above the

  8. Meta-path based heterogeneous combat network link prediction

    Science.gov (United States)

    Li, Jichao; Ge, Bingfeng; Yang, Kewei; Chen, Yingwu; Tan, Yuejin

    2017-09-01

    The combat system-of-systems in high-tech informative warfare, composed of many interconnected combat systems of different types, can be regarded as a type of complex heterogeneous network. Link prediction for heterogeneous combat networks (HCNs) is of significant military value, as it facilitates reconfiguring combat networks to represent the complex real-world network topology as appropriate with observed information. This paper proposes a novel integrated methodology framework called HCNMP (HCN link prediction based on meta-path) to predict multiple types of links simultaneously for an HCN. More specifically, the concept of HCN meta-paths is introduced, through which the HCNMP can accumulate information by extracting different features of HCN links for all the six defined types. Next, an HCN link prediction model, based on meta-path features, is built to predict all types of links of the HCN simultaneously. Then, the solution algorithm for the HCN link prediction model is proposed, in which the prediction results are obtained by iteratively updating with the newly predicted results until the results in the HCN converge or reach a certain maximum iteration number. Finally, numerical experiments on the dataset of a real HCN are conducted to demonstrate the feasibility and effectiveness of the proposed HCNMP, in comparison with 30 baseline methods. The results show that the performance of the HCNMP is superior to those of the baseline methods.

  9. Congested Link Inference Algorithms in Dynamic Routing IP Network

    Directory of Open Access Journals (Sweden)

    Yu Chen

    2017-01-01

    Full Text Available The performance descending of current congested link inference algorithms is obviously in dynamic routing IP network, such as the most classical algorithm CLINK. To overcome this problem, based on the assumptions of Markov property and time homogeneity, we build a kind of Variable Structure Discrete Dynamic Bayesian (VSDDB network simplified model of dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP and Rest Bayesian Network Model (RBNM, we proposed an Improved CLINK (ICLINK algorithm. Considering the concurrent phenomenon of multiple link congestion usually happens, we also proposed algorithm CLILRS (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient to infer the set of congested links. We validated our results by the experiments of analogy, simulation, and actual Internet.

  10. Predicting cryptic links in host-parasite networks.

    Directory of Open Access Journals (Sweden)

    Tad Dallas

    2017-05-01

    Full Text Available Networks are a way to represent interactions among one (e.g., social networks or more (e.g., plant-pollinator networks classes of nodes. The ability to predict likely, but unobserved, interactions has generated a great deal of interest, and is sometimes referred to as the link prediction problem. However, most studies of link prediction have focused on social networks, and have assumed a completely censused network. In biological networks, it is unlikely that all interactions are censused, and ignoring incomplete detection of interactions may lead to biased or incorrect conclusions. Previous attempts to predict network interactions have relied on known properties of network structure, making the approach sensitive to observation errors. This is an obvious shortcoming, as networks are dynamic, and sometimes not well sampled, leading to incomplete detection of links. Here, we develop an algorithm to predict missing links based on conditional probability estimation and associated, node-level features. We validate this algorithm on simulated data, and then apply it to a desert small mammal host-parasite network. Our approach achieves high accuracy on simulated and observed data, providing a simple method to accurately predict missing links in networks without relying on prior knowledge about network structure.

  11. Disrupting the Dissertation: Linked Data, Enhanced Publication and Algorithmic Culture

    Science.gov (United States)

    Tracy, Frances; Carmichael, Patrick

    2017-01-01

    This article explores how the three aspects of Striphas' notion of algorithmic culture (information, crowds and algorithms) might influence and potentially disrupt established educational practices. We draw on our experience of introducing semantic web and linked data technologies into higher education settings, focussing on extended student…

  12. An efficient link prediction index for complex military organization

    Science.gov (United States)

    Fan, Changjun; Liu, Zhong; Lu, Xin; Xiu, Baoxin; Chen, Qing

    2017-03-01

    Quality of information is crucial for decision-makers to judge the battlefield situations and design the best operation plans, however, real intelligence data are often incomplete and noisy, where missing links prediction methods and spurious links identification algorithms can be applied, if modeling the complex military organization as the complex network where nodes represent functional units and edges denote communication links. Traditional link prediction methods usually work well on homogeneous networks, but few for the heterogeneous ones. And the military network is a typical heterogeneous network, where there are different types of nodes and edges. In this paper, we proposed a combined link prediction index considering both the nodes' types effects and nodes' structural similarities, and demonstrated that it is remarkably superior to all the 25 existing similarity-based methods both in predicting missing links and identifying spurious links in a real military network data; we also investigated the algorithms' robustness under noisy environment, and found the mistaken information is more misleading than incomplete information in military areas, which is different from that in recommendation systems, and our method maintained the best performance under the condition of small noise. Since the real military network intelligence must be carefully checked at first due to its significance, and link prediction methods are just adopted to purify the network with the left latent noise, the method proposed here is applicable in real situations. In the end, as the FINC-E model, here used to describe the complex military organizations, is also suitable to many other social organizations, such as criminal networks, business organizations, etc., thus our method has its prospects in these areas for many tasks, like detecting the underground relationships between terrorists, predicting the potential business markets for decision-makers, and so on.

  13. An algorithm for link restoration of wavelength routing optical networks

    DEFF Research Database (Denmark)

    Limal, Emmanuel; Stubkjær, Kristian

    1999-01-01

    We present an algorithm for restoration of single link failure in wavelength routing multihop optical networks. The algorithm is based on an innovative study of networks using graph theory. It has the following original features: it (i) assigns working and spare channels simultaneously, (ii......) prevents the search for unacceptable routing paths by pointing out channels required for restoration, (iii) offers a high utilization of the capacity resources and (iv) allows a trivial search for the restoration paths. The algorithm is for link restoration of networks without wavelength translation. Its...

  14. Machine learning algorithms for datasets popularity prediction

    CERN Document Server

    Kancys, Kipras

    2016-01-01

    This report represents continued study where ML algorithms were used to predict databases popularity. Three topics were covered. First of all, there was a discrepancy between old and new meta-data collection procedures, so a reason for that had to be found. Secondly, different parameters were analysed and dropped to make algorithms perform better. And third, it was decided to move modelling part on Spark.

  15. Link Label Prediction in Signed Citation Network

    KAUST Repository

    Akujuobi, Uchenna Thankgod

    2016-01-01

    such as using regression, trust propagation and matrix factorization. These approaches have tried to solve the problem of link label prediction by using ideas from social theories, where most of them predict a single missing label given that labels of other

  16. Linking mothers and infants within electronic health records: a comparison of deterministic and probabilistic algorithms.

    Science.gov (United States)

    Baldwin, Eric; Johnson, Karin; Berthoud, Heidi; Dublin, Sascha

    2015-01-01

    To compare probabilistic and deterministic algorithms for linking mothers and infants within electronic health records (EHRs) to support pregnancy outcomes research. The study population was women enrolled in Group Health (Washington State, USA) delivering a liveborn infant from 2001 through 2008 (N = 33,093 deliveries) and infant members born in these years. We linked women to infants by surname, address, and dates of birth and delivery using deterministic and probabilistic algorithms. In a subset previously linked using "gold standard" identifiers (N = 14,449), we assessed each approach's sensitivity and positive predictive value (PPV). For deliveries with no "gold standard" linkage (N = 18,644), we compared the algorithms' linkage proportions. We repeated our analyses in an independent test set of deliveries from 2009 through 2013. We reviewed medical records to validate a sample of pairs apparently linked by one algorithm but not the other (N = 51 or 1.4% of discordant pairs). In the 2001-2008 "gold standard" population, the probabilistic algorithm's sensitivity was 84.1% (95% CI, 83.5-84.7) and PPV 99.3% (99.1-99.4), while the deterministic algorithm had sensitivity 74.5% (73.8-75.2) and PPV 95.7% (95.4-96.0). In the test set, the probabilistic algorithm again had higher sensitivity and PPV. For deliveries in 2001-2008 with no "gold standard" linkage, the probabilistic algorithm found matched infants for 58.3% and the deterministic algorithm, 52.8%. On medical record review, 100% of linked pairs appeared valid. A probabilistic algorithm improved linkage proportion and accuracy compared to a deterministic algorithm. Better linkage methods can increase the value of EHRs for pregnancy outcomes research. Copyright © 2014 John Wiley & Sons, Ltd.

  17. Link prediction via generalized coupled tensor factorisation

    DEFF Research Database (Denmark)

    Ermiş, Beyza; Evrim, Acar Ataman; Taylan Cemgil, A.

    2012-01-01

    and higher-order tensors. We propose to use an approach based on probabilistic interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor Factorisation, which can simultaneously fit a large class of tensor models to higher-order tensors/matrices with com- mon latent factors using...... different loss functions. Numerical experiments demonstrate that joint analysis of data from multiple sources via coupled factorisation improves the link prediction performance and the selection of right loss function and tensor model is crucial for accurately predicting missing links....

  18. Optimal design of link systems using successive zooming genetic algorithm

    Science.gov (United States)

    Kwon, Young-Doo; Sohn, Chang-hyun; Kwon, Soon-Bum; Lim, Jae-gyoo

    2009-07-01

    Link-systems have been around for a long time and are still used to control motion in diverse applications such as automobiles, robots and industrial machinery. This study presents a procedure involving the use of a genetic algorithm for the optimal design of single four-bar link systems and a double four-bar link system used in diesel engine. We adopted the Successive Zooming Genetic Algorithm (SZGA), which has one of the most rapid convergence rates among global search algorithms. The results are verified by experiment and the Recurdyn dynamic motion analysis package. During the optimal design of single four-bar link systems, we found in the case of identical input/output (IO) angles that the initial and final configurations show certain symmetry. For the double link system, we introduced weighting factors for the multi-objective functions, which minimize the difference between output angles, providing balanced engine performance, as well as the difference between final output angle and the desired magnitudes of final output angle. We adopted a graphical method to select a proper ratio between the weighting factors.

  19. Link Prediction in Evolving Networks Based on Popularity of Nodes.

    Science.gov (United States)

    Wang, Tong; He, Xing-Sheng; Zhou, Ming-Yang; Fu, Zhong-Qian

    2017-08-02

    Link prediction aims to uncover the underlying relationship behind networks, which could be utilized to predict missing edges or identify the spurious edges. The key issue of link prediction is to estimate the likelihood of potential links in networks. Most classical static-structure based methods ignore the temporal aspects of networks, limited by the time-varying features, such approaches perform poorly in evolving networks. In this paper, we propose a hypothesis that the ability of each node to attract links depends not only on its structural importance, but also on its current popularity (activeness), since active nodes have much more probability to attract future links. Then a novel approach named popularity based structural perturbation method (PBSPM) and its fast algorithm are proposed to characterize the likelihood of an edge from both existing connectivity structure and current popularity of its two endpoints. Experiments on six evolving networks show that the proposed methods outperform state-of-the-art methods in accuracy and robustness. Besides, visual results and statistical analysis reveal that the proposed methods are inclined to predict future edges between active nodes, rather than edges between inactive nodes.

  20. Hybrid Monte Carlo algorithm with fat link fermion actions

    International Nuclear Information System (INIS)

    Kamleh, Waseem; Leinweber, Derek B.; Williams, Anthony G.

    2004-01-01

    The use of APE smearing or other blocking techniques in lattice fermion actions can provide many advantages. There are many variants of these fat link actions in lattice QCD currently, such as flat link irrelevant clover (FLIC) fermions. The FLIC fermion formalism makes use of the APE blocking technique in combination with a projection of the blocked links back into the special unitary group. This reunitarization is often performed using an iterative maximization of a gauge invariant measure. This technique is not differentiable with respect to the gauge field and thus prevents the use of standard Hybrid Monte Carlo simulation algorithms. The use of an alternative projection technique circumvents this difficulty and allows the simulation of dynamical fat link fermions with standard HMC and its variants. The necessary equations of motion for FLIC fermions are derived, and some initial simulation results are presented. The technique is more general however, and is straightforwardly applicable to other smearing techniques or fat link actions

  1. Predicting Hidden Links in Supply Networks

    Directory of Open Access Journals (Sweden)

    A. Brintrup

    2018-01-01

    Full Text Available Manufacturing companies often lack visibility of the procurement interdependencies between the suppliers within their supply network. However, knowledge of these interdependencies is useful to plan for potential operational disruptions. In this paper, we develop the Supply Network Link Predictor (SNLP method to infer supplier interdependencies using the manufacturer’s incomplete knowledge of the network. SNLP uses topological data to extract relational features from the known network to train a classifier for predicting potential links. Using a test case from the automotive industry, four features are extracted: (i number of existing supplier links, (ii overlaps between supplier product portfolios, (iii product outsourcing associations, and (iv likelihood of buyers purchasing from two suppliers together. Naïve Bayes and Logistic Regression are then employed to predict whether these features can help predict interdependencies between two suppliers. Our results show that these features can indeed be used to predict interdependencies in the network and that predictive accuracy is maximised by (i and (iii. The findings give rise to the exciting possibility of using data analytics for improving supply chain visibility. We then proceed to discuss to what extent such approaches can be adopted and their limitations, highlighting next steps for future work in this area.

  2. Development of a Thermal Equilibrium Prediction Algorithm

    International Nuclear Information System (INIS)

    Aviles-Ramos, Cuauhtemoc

    2002-01-01

    A thermal equilibrium prediction algorithm is developed and tested using a heat conduction model and data sets from calorimetric measurements. The physical model used in this study is the exact solution of a system of two partial differential equations that govern the heat conduction in the calorimeter. A multi-parameter estimation technique is developed and implemented to estimate the effective volumetric heat generation and thermal diffusivity in the calorimeter measurement chamber, and the effective thermal diffusivity of the heat flux sensor. These effective properties and the exact solution are used to predict the heat flux sensor voltage readings at thermal equilibrium. Thermal equilibrium predictions are carried out considering only 20% of the total measurement time required for thermal equilibrium. A comparison of the predicted and experimental thermal equilibrium voltages shows that the average percentage error from 330 data sets is only 0.1%. The data sets used in this study come from calorimeters of different sizes that use different kinds of heat flux sensors. Furthermore, different nuclear material matrices were assayed in the process of generating these data sets. This study shows that the integration of this algorithm into the calorimeter data acquisition software will result in an 80% reduction of measurement time. This reduction results in a significant cutback in operational costs for the calorimetric assay of nuclear materials. (authors)

  3. A new mutually reinforcing network node and link ranking algorithm.

    Science.gov (United States)

    Wang, Zhenghua; Dueñas-Osorio, Leonardo; Padgett, Jamie E

    2015-10-23

    This study proposes a novel Normalized Wide network Ranking algorithm (NWRank) that has the advantage of ranking nodes and links of a network simultaneously. This algorithm combines the mutual reinforcement feature of Hypertext Induced Topic Selection (HITS) and the weight normalization feature of PageRank. Relative weights are assigned to links based on the degree of the adjacent neighbors and the Betweenness Centrality instead of assigning the same weight to every link as assumed in PageRank. Numerical experiment results show that NWRank performs consistently better than HITS, PageRank, eigenvector centrality, and edge betweenness from the perspective of network connectivity and approximate network flow, which is also supported by comparisons with the expensive N-1 benchmark removal criteria based on network efficiency. Furthermore, it can avoid some problems, such as the Tightly Knit Community effect, which exists in HITS. NWRank provides a new inexpensive way to rank nodes and links of a network, which has practical applications, particularly to prioritize resource allocation for upgrade of hierarchical and distributed networks, as well as to support decision making in the design of networks, where node and link importance depend on a balance of local and global integrity.

  4. A new mutually reinforcing network node and link ranking algorithm

    Science.gov (United States)

    Wang, Zhenghua; Dueñas-Osorio, Leonardo; Padgett, Jamie E.

    2015-10-01

    This study proposes a novel Normalized Wide network Ranking algorithm (NWRank) that has the advantage of ranking nodes and links of a network simultaneously. This algorithm combines the mutual reinforcement feature of Hypertext Induced Topic Selection (HITS) and the weight normalization feature of PageRank. Relative weights are assigned to links based on the degree of the adjacent neighbors and the Betweenness Centrality instead of assigning the same weight to every link as assumed in PageRank. Numerical experiment results show that NWRank performs consistently better than HITS, PageRank, eigenvector centrality, and edge betweenness from the perspective of network connectivity and approximate network flow, which is also supported by comparisons with the expensive N-1 benchmark removal criteria based on network efficiency. Furthermore, it can avoid some problems, such as the Tightly Knit Community effect, which exists in HITS. NWRank provides a new inexpensive way to rank nodes and links of a network, which has practical applications, particularly to prioritize resource allocation for upgrade of hierarchical and distributed networks, as well as to support decision making in the design of networks, where node and link importance depend on a balance of local and global integrity.

  5. A new mutually reinforcing network node and link ranking algorithm

    Science.gov (United States)

    Wang, Zhenghua; Dueñas-Osorio, Leonardo; Padgett, Jamie E.

    2015-01-01

    This study proposes a novel Normalized Wide network Ranking algorithm (NWRank) that has the advantage of ranking nodes and links of a network simultaneously. This algorithm combines the mutual reinforcement feature of Hypertext Induced Topic Selection (HITS) and the weight normalization feature of PageRank. Relative weights are assigned to links based on the degree of the adjacent neighbors and the Betweenness Centrality instead of assigning the same weight to every link as assumed in PageRank. Numerical experiment results show that NWRank performs consistently better than HITS, PageRank, eigenvector centrality, and edge betweenness from the perspective of network connectivity and approximate network flow, which is also supported by comparisons with the expensive N-1 benchmark removal criteria based on network efficiency. Furthermore, it can avoid some problems, such as the Tightly Knit Community effect, which exists in HITS. NWRank provides a new inexpensive way to rank nodes and links of a network, which has practical applications, particularly to prioritize resource allocation for upgrade of hierarchical and distributed networks, as well as to support decision making in the design of networks, where node and link importance depend on a balance of local and global integrity. PMID:26492958

  6. A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. 3: A stochastic rain fade control algorithm for satellite link power via non linear Markow filtering theory

    Science.gov (United States)

    Manning, Robert M.

    1991-01-01

    The dynamic and composite nature of propagation impairments that are incurred on Earth-space communications links at frequencies in and above 30/20 GHz Ka band, i.e., rain attenuation, cloud and/or clear air scintillation, etc., combined with the need to counter such degradations after the small link margins have been exceeded, necessitate the use of dynamic statistical identification and prediction processing of the fading signal in order to optimally estimate and predict the levels of each of the deleterious attenuation components. Such requirements are being met in NASA's Advanced Communications Technology Satellite (ACTS) Project by the implementation of optimal processing schemes derived through the use of the Rain Attenuation Prediction Model and nonlinear Markov filtering theory.

  7. Link prediction based on nonequilibrium cooperation effect

    Science.gov (United States)

    Li, Lanxi; Zhu, Xuzhen; Tian, Hui

    2018-04-01

    Link prediction in complex networks has become a common focus of many researchers. But most existing methods concentrate on neighbors, and rarely consider degree heterogeneity of two endpoints. Node degree represents the importance or status of endpoints. We describe the large-degree heterogeneity as the nonequilibrium between nodes. This nonequilibrium facilitates a stable cooperation between endpoints, so that two endpoints with large-degree heterogeneity tend to connect stably. We name such a phenomenon as the nonequilibrium cooperation effect. Therefore, this paper proposes a link prediction method based on the nonequilibrium cooperation effect to improve accuracy. Theoretical analysis will be processed in advance, and at the end, experiments will be performed in 12 real-world networks to compare the mainstream methods with our indices in the network through numerical analysis.

  8. Predicting mining activity with parallel genetic algorithms

    Science.gov (United States)

    Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.; ,; ,

    2005-01-01

    We explore several different techniques in our quest to improve the overall model performance of a genetic algorithm calibrated probabilistic cellular automata. We use the Kappa statistic to measure correlation between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.

  9. Efficient predictive algorithms for image compression

    CERN Document Server

    Rosário Lucas, Luís Filipe; Maciel de Faria, Sérgio Manuel; Morais Rodrigues, Nuno Miguel; Liberal Pagliari, Carla

    2017-01-01

    This book discusses efficient prediction techniques for the current state-of-the-art High Efficiency Video Coding (HEVC) standard, focusing on the compression of a wide range of video signals, such as 3D video, Light Fields and natural images. The authors begin with a review of the state-of-the-art predictive coding methods and compression technologies for both 2D and 3D multimedia contents, which provides a good starting point for new researchers in the field of image and video compression. New prediction techniques that go beyond the standardized compression technologies are then presented and discussed. In the context of 3D video, the authors describe a new predictive algorithm for the compression of depth maps, which combines intra-directional prediction, with flexible block partitioning and linear residue fitting. New approaches are described for the compression of Light Field and still images, which enforce sparsity constraints on linear models. The Locally Linear Embedding-based prediction method is in...

  10. Nonlinear model predictive control theory and algorithms

    CERN Document Server

    Grüne, Lars

    2017-01-01

    This book offers readers a thorough and rigorous introduction to nonlinear model predictive control (NMPC) for discrete-time and sampled-data systems. NMPC schemes with and without stabilizing terminal constraints are detailed, and intuitive examples illustrate the performance of different NMPC variants. NMPC is interpreted as an approximation of infinite-horizon optimal control so that important properties like closed-loop stability, inverse optimality and suboptimality can be derived in a uniform manner. These results are complemented by discussions of feasibility and robustness. An introduction to nonlinear optimal control algorithms yields essential insights into how the nonlinear optimization routine—the core of any nonlinear model predictive controller—works. Accompanying software in MATLAB® and C++ (downloadable from extras.springer.com/), together with an explanatory appendix in the book itself, enables readers to perform computer experiments exploring the possibilities and limitations of NMPC. T...

  11. An auxiliary optimization method for complex public transit route network based on link prediction

    Science.gov (United States)

    Zhang, Lin; Lu, Jian; Yue, Xianfei; Zhou, Jialin; Li, Yunxuan; Wan, Qian

    2018-02-01

    Inspired by the missing (new) link prediction and the spurious existing link identification in link prediction theory, this paper establishes an auxiliary optimization method for public transit route network (PTRN) based on link prediction. First, link prediction applied to PTRN is described, and based on reviewing the previous studies, the summary indices set and its algorithms set are collected for the link prediction experiment. Second, through analyzing the topological properties of Jinan’s PTRN established by the Space R method, we found that this is a typical small-world network with a relatively large average clustering coefficient. This phenomenon indicates that the structural similarity-based link prediction will show a good performance in this network. Then, based on the link prediction experiment of the summary indices set, three indices with maximum accuracy are selected for auxiliary optimization of Jinan’s PTRN. Furthermore, these link prediction results show that the overall layout of Jinan’s PTRN is stable and orderly, except for a partial area that requires optimization and reconstruction. The above pattern conforms to the general pattern of the optimal development stage of PTRN in China. Finally, based on the missing (new) link prediction and the spurious existing link identification, we propose optimization schemes that can be used not only to optimize current PTRN but also to evaluate PTRN planning.

  12. Graph regularized nonnegative matrix factorization for temporal link prediction in dynamic networks

    Science.gov (United States)

    Ma, Xiaoke; Sun, Penggang; Wang, Yu

    2018-04-01

    Many networks derived from society and nature are temporal and incomplete. The temporal link prediction problem in networks is to predict links at time T + 1 based on a given temporal network from time 1 to T, which is essential to important applications. The current algorithms either predict the temporal links by collapsing the dynamic networks or collapsing features derived from each network, which are criticized for ignoring the connection among slices. to overcome the issue, we propose a novel graph regularized nonnegative matrix factorization algorithm (GrNMF) for the temporal link prediction problem without collapsing the dynamic networks. To obtain the feature for each network from 1 to t, GrNMF factorizes the matrix associated with networks by setting the rest networks as regularization, which provides a better way to characterize the topological information of temporal links. Then, the GrNMF algorithm collapses the feature matrices to predict temporal links. Compared with state-of-the-art methods, the proposed algorithm exhibits significantly improved accuracy by avoiding the collapse of temporal networks. Experimental results of a number of artificial and real temporal networks illustrate that the proposed method is not only more accurate but also more robust than state-of-the-art approaches.

  13. Community detection, link prediction, and layer interdependence in multilayer networks

    Science.gov (United States)

    De Bacco, Caterina; Power, Eleanor A.; Larremore, Daniel B.; Moore, Cristopher

    2017-04-01

    Complex systems are often characterized by distinct types of interactions between the same entities. These can be described as a multilayer network where each layer represents one type of interaction. These layers may be interdependent in complicated ways, revealing different kinds of structure in the network. In this work we present a generative model, and an efficient expectation-maximization algorithm, which allows us to perform inference tasks such as community detection and link prediction in this setting. Our model assumes overlapping communities that are common between the layers, while allowing these communities to affect each layer in a different way, including arbitrary mixtures of assortative, disassortative, or directed structure. It also gives us a mathematically principled way to define the interdependence between layers, by measuring how much information about one layer helps us predict links in another layer. In particular, this allows us to bundle layers together to compress redundant information and identify small groups of layers which suffice to predict the remaining layers accurately. We illustrate these findings by analyzing synthetic data and two real multilayer networks, one representing social support relationships among villagers in South India and the other representing shared genetic substring material between genes of the malaria parasite.

  14. Improved hybrid optimization algorithm for 3D protein structure prediction.

    Science.gov (United States)

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm - PGATS algorithm, which is based on toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. Otherwise, we also take some different improved strategies. The factor of stochastic disturbance is joined in the particle swarm optimization to improve the search ability; the operations of crossover and mutation that are in the genetic algorithm are changed to a kind of random liner method; at last tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, the protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but the problem can be attributed to a global optimization problem of multi-extremum and multi-parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcoming of a single algorithm, giving full play to the advantage of each algorithm. In the current universal standard sequences, Fibonacci sequences and real protein sequences are certified. Experiments show that the proposed new method outperforms single algorithms on the accuracy of calculating the protein sequence energy value, which is proved to be an effective way to predict the structure of proteins.

  15. Link prediction boosted psychiatry disorder classification for functional connectivity network

    Science.gov (United States)

    Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang

    2017-02-01

    Functional connectivity network (FCN) is an effective tool in psychiatry disorders classification, and represents cross-correlation of the regional blood oxygenation level dependent signal. However, FCN is often incomplete for suffering from missing and spurious edges. To accurate classify psychiatry disorders and health control with the incomplete FCN, we first `repair' the FCN with link prediction, and then exact the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers for improving classification accuracy. Our method tested by three datasets of psychiatry disorder, including Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.

  16. Link adaptation algorithm for distributed coded transmissions in cooperative OFDMA systems

    DEFF Research Database (Denmark)

    Varga, Mihaly; Badiu, Mihai Alin; Bota, Vasile

    2015-01-01

    This paper proposes a link adaptation algorithm for cooperative transmissions in the down-link connection of an OFDMA-based wireless system. The algorithm aims at maximizing the spectral efficiency of a relay-aided communication link, while satisfying the block error rate constraints at both...... adaptation algorithm has linear complexity with the number of available resource blocks, while still provides a very good performance, as shown by simulation results....

  17. A class-based link prediction using Distance Dependent Chinese Restaurant Process

    Science.gov (United States)

    Andalib, Azam; Babamir, Seyed Morteza

    2016-08-01

    One of the important tasks in relational data analysis is link prediction which has been successfully applied on many applications such as bioinformatics, information retrieval, etc. The link prediction is defined as predicting the existence or absence of edges between nodes of a network. In this paper, we propose a novel method for link prediction based on Distance Dependent Chinese Restaurant Process (DDCRP) model which enables us to utilize the information of the topological structure of the network such as shortest path and connectivity of the nodes. We also propose a new Gibbs sampling algorithm for computing the posterior distribution of the hidden variables based on the training data. Experimental results on three real-world datasets show the superiority of the proposed method over other probabilistic models for link prediction problem.

  18. Link-Prediction Enhanced Consensus Clustering for Complex Networks (Open Access)

    Science.gov (United States)

    2016-05-20

    RESEARCH ARTICLE Link-Prediction Enhanced Consensus Clustering for Complex Networks Matthew Burgess1*, Eytan Adar1,2, Michael Cafarella1 1Computer...consensus clustering algorithm to enhance community detection on incomplete networks. Our framework utilizes existing community detection algorithms that...types of complex networks exhibit community structure: groups of highly connected nodes. Communities or clusters often reflect nodes that share similar

  19. An Improved Algorithm for Predicting Free Recalls

    Science.gov (United States)

    Laming, Donald

    2008-01-01

    Laming [Laming, D. (2006). "Predicting free recalls." "Journal of Experimental Psychology: Learning, Memory, and Cognition," 32, 1146-1163] has shown that, in a free-recall experiment in which the participants rehearsed out loud, entire sequences of recalls could be predicted, to a useful degree of precision, from the prior sequences of stimuli…

  20. MHC Class II epitope predictive algorithms

    DEFF Research Database (Denmark)

    Nielsen, Morten; Lund, Ole; Buus, S

    2010-01-01

    Major histocompatibility complex class II (MHC-II) molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. To be able to predict the immune response to given pathogens, a number of methods have been...... developed to predict peptide-MHC binding. However, few methods other than the pioneering TEPITOPE/ProPred method have been developed for MHC-II. Despite recent progress in method development, the predictive performance for MHC-II remains significantly lower than what can be obtained for MHC-I. One reason...

  1. Appendix F. Developmental enforcement algorithm definition document : predictive braking enforcement algorithm definition document.

    Science.gov (United States)

    2012-05-01

    The purpose of this document is to fully define and describe the logic flow and mathematical equations for a predictive braking enforcement algorithm intended for implementation in a Positive Train Control (PTC) system.

  2. Predicting Students’ Performance using Modified ID3 Algorithm

    OpenAIRE

    Ramanathan L; Saksham Dhanda; Suresh Kumar D

    2013-01-01

    The ability to predict performance of students is very crucial in our present education system. We can use data mining concepts for this purpose. ID3 algorithm is one of the famous algorithms present today to generate decision trees. But this algorithm has a shortcoming that it is inclined to attributes with many values. So , this research aims to overcome this shortcoming of the algorithm by using gain ratio(instead of information gain) as well as by giving weights to each attribute at every...

  3. Dynamic Algorithm for LQGPC Predictive Control

    DEFF Research Database (Denmark)

    Hangstrup, M.; Ordys, A.W.; Grimble, M.J.

    1998-01-01

    In this paper the optimal control law is derived for a multi-variable state space Linear Quadratic Gaussian Predictive Controller (LQGPC). A dynamic performance index is utilized resulting in an optimal steady state controller. Knowledge of future reference values is incorporated into the control......In this paper the optimal control law is derived for a multi-variable state space Linear Quadratic Gaussian Predictive Controller (LQGPC). A dynamic performance index is utilized resulting in an optimal steady state controller. Knowledge of future reference values is incorporated...... into the controller design and the solution is derived using the method of Lagrange multipliers. It is shown how well-known GPC controller can be obtained as a special case of the LQGPC controller design. The important advantage of using the LQGPC framework for designing predictive, e.g. GPS is that LQGPC enables...

  4. Exploiting Information Diffusion Feature for Link Prediction in Sina Weibo.

    Science.gov (United States)

    Li, Dong; Zhang, Yongchao; Xu, Zhiming; Chu, Dianhui; Li, Sheng

    2016-01-28

    The rapid development of online social networks (e.g., Twitter and Facebook) has promoted research related to social networks in which link prediction is a key problem. Although numerous attempts have been made for link prediction based on network structure, node attribute and so on, few of the current studies have considered the impact of information diffusion on link creation and prediction. This paper mainly addresses Sina Weibo, which is the largest microblog platform with Chinese characteristics, and proposes the hypothesis that information diffusion influences link creation and verifies the hypothesis based on real data analysis. We also detect an important feature from the information diffusion process, which is used to promote link prediction performance. Finally, the experimental results on Sina Weibo dataset have demonstrated the effectiveness of our methods.

  5. Exploiting Information Diffusion Feature for Link Prediction in Sina Weibo

    Science.gov (United States)

    Li, Dong; Zhang, Yongchao; Xu, Zhiming; Chu, Dianhui; Li, Sheng

    2016-01-01

    The rapid development of online social networks (e.g., Twitter and Facebook) has promoted research related to social networks in which link prediction is a key problem. Although numerous attempts have been made for link prediction based on network structure, node attribute and so on, few of the current studies have considered the impact of information diffusion on link creation and prediction. This paper mainly addresses Sina Weibo, which is the largest microblog platform with Chinese characteristics, and proposes the hypothesis that information diffusion influences link creation and verifies the hypothesis based on real data analysis. We also detect an important feature from the information diffusion process, which is used to promote link prediction performance. Finally, the experimental results on Sina Weibo dataset have demonstrated the effectiveness of our methods.

  6. Computationally efficient model predictive control algorithms a neural network approach

    CERN Document Server

    Ławryńczuk, Maciej

    2014-01-01

    This book thoroughly discusses computationally efficient (suboptimal) Model Predictive Control (MPC) techniques based on neural models. The subjects treated include: ·         A few types of suboptimal MPC algorithms in which a linear approximation of the model or of the predicted trajectory is successively calculated on-line and used for prediction. ·         Implementation details of the MPC algorithms for feedforward perceptron neural models, neural Hammerstein models, neural Wiener models and state-space neural models. ·         The MPC algorithms based on neural multi-models (inspired by the idea of predictive control). ·         The MPC algorithms with neural approximation with no on-line linearization. ·         The MPC algorithms with guaranteed stability and robustness. ·         Cooperation between the MPC algorithms and set-point optimization. Thanks to linearization (or neural approximation), the presented suboptimal algorithms do not require d...

  7. Evaluating ortholog prediction algorithms in a yeast model clade.

    Directory of Open Access Journals (Sweden)

    Leonidas Salichos

    Full Text Available BACKGROUND: Accurate identification of orthologs is crucial for evolutionary studies and for functional annotation. Several algorithms have been developed for ortholog delineation, but so far, manually curated genome-scale biological databases of orthologous genes for algorithm evaluation have been lacking. We evaluated four popular ortholog prediction algorithms (MultiParanoid; and OrthoMCL; RBH: Reciprocal Best Hit; RSD: Reciprocal Smallest Distance; the last two extended into clustering algorithms cRBH and cRSD, respectively, so that they can predict orthologs across multiple taxa against a set of 2,723 groups of high-quality curated orthologs from 6 Saccharomycete yeasts in the Yeast Gene Order Browser. RESULTS: Examination of sensitivity [TP/(TP+FN], specificity [TN/(TN+FP], and accuracy [(TP+TN/(TP+TN+FP+FN] across a broad parameter range showed that cRBH was the most accurate and specific algorithm, whereas OrthoMCL was the most sensitive. Evaluation of the algorithms across a varying number of species showed that cRBH had the highest accuracy and lowest false discovery rate [FP/(FP+TP], followed by cRSD. Of the six species in our set, three descended from an ancestor that underwent whole genome duplication. Subsequent differential duplicate loss events in the three descendants resulted in distinct classes of gene loss patterns, including cases where the genes retained in the three descendants are paralogs, constituting 'traps' for ortholog prediction algorithms. We found that the false discovery rate of all algorithms dramatically increased in these traps. CONCLUSIONS: These results suggest that simple algorithms, like cRBH, may be better ortholog predictors than more complex ones (e.g., OrthoMCL and MultiParanoid for evolutionary and functional genomics studies where the objective is the accurate inference of single-copy orthologs (e.g., molecular phylogenetics, but that all algorithms fail to accurately predict orthologs when paralogy

  8. Improving local clustering based top-L link prediction methods via asymmetric link clustering information

    Science.gov (United States)

    Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan

    2018-02-01

    Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis, and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. We can note that clustering information plays an important role in solving the link prediction problem. In previous literatures, we find node clustering coefficient appears frequently in many link prediction methods. However, node clustering coefficient is limited to describe the role of a common-neighbor in different local networks, because it cannot distinguish different clustering abilities of a node to different node pairs. In this paper, we shift our focus from nodes to links, and propose the concept of asymmetric link clustering (ALC) coefficient. Further, we improve three node clustering based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node clustering based methods, especially achieving remarkable improvements on food web, hamster friendship and Internet networks. Besides, comparing with other methods, the performance of ALC-based methods are very stable in both globalized and personalized top-L link prediction tasks.

  9. Empirical algorithms to predict aragonite saturation state

    Science.gov (United States)

    Turk, Daniela; Dowd, Michael

    2017-04-01

    Novel sensor packages deployed on autonomous platforms (Profiling Floats, Gliders, Moorings, SeaCycler) and biogeochemical models have a potential to increase the coverage of a key water chemistry variable, aragonite saturation state (ΩAr) in time and space, in particular in the under sampled regions of global ocean. However, these do not provide the set of inorganic carbon measurements commonly used to derive ΩAr. There is therefore a need to develop regional predictive models to determine ΩAr from measurements of commonly observed or/and non carbonate oceanic variables. Here, we investigate predictive skill of several commonly observed oceanographic variables (temperature, salinity, oxygen, nitrate, phosphate and silicate) in determining ΩAr using climatology and shipboard data. This will allow us to assess potential for autonomous sensors and biogeochemical models to monitor ΩAr regionally and globally. We apply the regression models to several time series data sets and discuss regional differences and their implications for global estimates of ΩAr.

  10. Adaptive Outlier-tolerant Exponential Smoothing Prediction Algorithms with Applications to Predict the Temperature in Spacecraft

    OpenAIRE

    Hu Shaolin; Zhang Wei; Li Ye; Fan Shunxi

    2011-01-01

    The exponential smoothing prediction algorithm is widely used in spaceflight control and in process monitoring as well as in economical prediction. There are two key conundrums which are open: one is about the selective rule of the parameter in the exponential smoothing prediction, and the other is how to improve the bad influence of outliers on prediction. In this paper a new practical outlier-tolerant algorithm is built to select adaptively proper parameter, and the exponential smoothing pr...

  11. The wind power prediction research based on mind evolutionary algorithm

    Science.gov (United States)

    Zhuang, Ling; Zhao, Xinjian; Ji, Tianming; Miao, Jingwen; Cui, Haina

    2018-04-01

    When the wind power is connected to the power grid, its characteristics of fluctuation, intermittent and randomness will affect the stability of the power system. The wind power prediction can guarantee the power quality and reduce the operating cost of power system. There were some limitations in several traditional wind power prediction methods. On the basis, the wind power prediction method based on Mind Evolutionary Algorithm (MEA) is put forward and a prediction model is provided. The experimental results demonstrate that MEA performs efficiently in term of the wind power prediction. The MEA method has broad prospect of engineering application.

  12. Enhanced clinical pharmacy service targeting tools: risk-predictive algorithms.

    Science.gov (United States)

    El Hajji, Feras W D; Scullin, Claire; Scott, Michael G; McElnay, James C

    2015-04-01

    This study aimed to determine the value of using a mix of clinical pharmacy data and routine hospital admission spell data in the development of predictive algorithms. Exploration of risk factors in hospitalized patients, together with the targeting strategies devised, will enable the prioritization of clinical pharmacy services to optimize patient outcomes. Predictive algorithms were developed using a number of detailed steps using a 75% sample of integrated medicines management (IMM) patients, and validated using the remaining 25%. IMM patients receive targeted clinical pharmacy input throughout their hospital stay. The algorithms were applied to the validation sample, and predicted risk probability was generated for each patient from the coefficients. Risk threshold for the algorithms were determined by identifying the cut-off points of risk scores at which the algorithm would have the highest discriminative performance. Clinical pharmacy staffing levels were obtained from the pharmacy department staffing database. Numbers of previous emergency admissions and admission medicines together with age-adjusted co-morbidity and diuretic receipt formed a 12-month post-discharge and/or readmission risk algorithm. Age-adjusted co-morbidity proved to be the best index to predict mortality. Increased numbers of clinical pharmacy staff at ward level was correlated with a reduction in risk-adjusted mortality index (RAMI). Algorithms created were valid in predicting risk of in-hospital and post-discharge mortality and risk of hospital readmission 3, 6 and 12 months post-discharge. The provision of ward-based clinical pharmacy services is a key component to reducing RAMI and enabling the full benefits of pharmacy input to patient care to be realized. © 2014 John Wiley & Sons, Ltd.

  13. A Traffic Prediction Algorithm for Street Lighting Control Efficiency

    Directory of Open Access Journals (Sweden)

    POPA Valentin

    2013-01-01

    Full Text Available This paper presents the development of a traffic prediction algorithm that can be integrated in a street lighting monitoring and control system. The prediction algorithm must enable the reduction of energy costs and improve energy efficiency by decreasing the light intensity depending on the traffic level. The algorithm analyses and processes the information received at the command center based on the traffic level at different moments. The data is collected by means of the Doppler vehicle detection sensors integrated within the system. Thus, two methods are used for the implementation of the algorithm: a neural network and a k-NN (k-Nearest Neighbor prediction algorithm. For 500 training cycles, the mean square error of the neural network is 9.766 and for 500.000 training cycles the error amounts to 0.877. In case of the k-NN algorithm the error increases from 8.24 for k=5 to 12.27 for a number of 50 neighbors. In terms of a root means square error parameter, the use of a neural network ensures the highest performance level and can be integrated in a street lighting control system.

  14. Optimisation of the link volume for weakest link failure prediction in NBG-18 nuclear graphite

    International Nuclear Information System (INIS)

    Hindley, Michael P.; Groenwold, Albert A.; Blaine, Deborah C.; Becker, Thorsten H.

    2014-01-01

    This paper describes the process for approximating the optimal size of a link volume required for weakest link failure calculation in nuclear graphite, with NBG-18 used as an example. As part of the failure methodology, the link volume is defined in terms of two grouping criteria. The first criterion is a factor of the maximum grain size and the second criterion is a function of an equivalent stress limit. A methodology for approximating these grouping criteria is presented. The failure methodology employs finite element analysis (FEA) in order to predict the failure load, at 50% probability of failure. The average experimental failure load, as determined for 26 test geometries, is used to evaluate the accuracy of the weakest link failure calculations. The influence of the two grouping criteria on the failure load prediction is evaluated by defining an error in prediction across all test cases. Mathematical optimisation is used to find the minimum error across a range of test case failure predictions. This minimum error is shown to deliver the most accurate failure prediction across a whole range of components, although some test cases in the range predict conservative failure load. The mathematical optimisation objective function is penalised to account for non-conservative prediction of the failure load for any test case. The optimisation is repeated and a link volume found for conservative failure prediction. The failure prediction for each test case is evaluated, in detail, for the proposed link volumes. Based on the analysis, link design volumes for NBG-18 are recommended for either accurate or conservative failure prediction

  15. Evaluating and comparing algorithms for respiratory motion prediction

    International Nuclear Information System (INIS)

    Ernst, F; Dürichen, R; Schlaefer, A; Schweikard, A

    2013-01-01

    In robotic radiosurgery, it is necessary to compensate for systematic latencies arising from target tracking and mechanical constraints. This compensation is usually achieved by means of an algorithm which computes the future target position. In most scientific works on respiratory motion prediction, only one or two algorithms are evaluated on a limited amount of very short motion traces. The purpose of this work is to gain more insight into the real world capabilities of respiratory motion prediction methods by evaluating many algorithms on an unprecedented amount of data. We have evaluated six algorithms, the normalized least mean squares (nLMS), recursive least squares (RLS), multi-step linear methods (MULIN), wavelet-based multiscale autoregression (wLMS), extended Kalman filtering, and ε-support vector regression (SVRpred) methods, on an extensive database of 304 respiratory motion traces. The traces were collected during treatment with the CyberKnife (Accuray, Inc., Sunnyvale, CA, USA) and feature an average length of 71 min. Evaluation was done using a graphical prediction toolkit, which is available to the general public, as is the data we used. The experiments show that the nLMS algorithm—which is one of the algorithms currently used in the CyberKnife—is outperformed by all other methods. This is especially true in the case of the wLMS, the SVRpred, and the MULIN algorithms, which perform much better. The nLMS algorithm produces a relative root mean square (RMS) error of 75% or less (i.e., a reduction in error of 25% or more when compared to not doing prediction) in only 38% of the test cases, whereas the MULIN and SVRpred methods reach this level in more than 77%, the wLMS algorithm in more than 84% of the test cases. Our work shows that the wLMS algorithm is the most accurate algorithm and does not require parameter tuning, making it an ideal candidate for clinical implementation. Additionally, we have seen that the structure of a patient

  16. Efficient network disintegration under incomplete information: the comic effect of link prediction

    Science.gov (United States)

    Tan, Suo-Yi; Wu, Jun; Lü, Linyuan; Li, Meng-Jun; Lu, Xin

    2016-01-01

    The study of network disintegration has attracted much attention due to its wide applications, including suppressing the epidemic spreading, destabilizing terrorist network, preventing financial contagion, controlling the rumor diffusion and perturbing cancer networks. The crux of this matter is to find the critical nodes whose removal will lead to network collapse. This paper studies the disintegration of networks with incomplete link information. An effective method is proposed to find the critical nodes by the assistance of link prediction techniques. Extensive experiments in both synthetic and real networks suggest that, by using link prediction method to recover partial missing links in advance, the method can largely improve the network disintegration performance. Besides, to our surprise, we find that when the size of missing information is relatively small, our method even outperforms than the results based on complete information. We refer to this phenomenon as the “comic effect” of link prediction, which means that the network is reshaped through the addition of some links that identified by link prediction algorithms, and the reshaped network is like an exaggerated but characteristic comic of the original one, where the important parts are emphasized. PMID:26960247

  17. Efficient network disintegration under incomplete information: the comic effect of link prediction

    Science.gov (United States)

    Tan, Suo-Yi; Wu, Jun; Lü, Linyuan; Li, Meng-Jun; Lu, Xin

    2016-03-01

    The study of network disintegration has attracted much attention due to its wide applications, including suppressing the epidemic spreading, destabilizing terrorist network, preventing financial contagion, controlling the rumor diffusion and perturbing cancer networks. The crux of this matter is to find the critical nodes whose removal will lead to network collapse. This paper studies the disintegration of networks with incomplete link information. An effective method is proposed to find the critical nodes by the assistance of link prediction techniques. Extensive experiments in both synthetic and real networks suggest that, by using link prediction method to recover partial missing links in advance, the method can largely improve the network disintegration performance. Besides, to our surprise, we find that when the size of missing information is relatively small, our method even outperforms than the results based on complete information. We refer to this phenomenon as the “comic effect” of link prediction, which means that the network is reshaped through the addition of some links that identified by link prediction algorithms, and the reshaped network is like an exaggerated but characteristic comic of the original one, where the important parts are emphasized.

  18. An algorithm for the diagnosis of X-linked intellectual disability in children

    Directory of Open Access Journals (Sweden)

    V. Yu. Voinova

    2016-01-01

    Full Text Available X-linked intellectual disability (XLID is a clinically and genetically heterogeneous group of hereditary diseases caused by mutations on the X chromosome, which lead to impaired intellectual development. The paper determines for the first time the proportion of X-linked diseases (6.54% in the pattern of intellectual disability in children. A system has been developed to quantify the clinical severity of fragile X mental retardation syndrome and Rett syndrome. A system has been scientifically justified to predict the clinical severity, which is based on an analysis of the impact of genetic and epigenetic factors (mutation type and location, X chromosome inactivation. The authors have determined the contribution of nonrandom X inactivation to the clinical polymorphism of various forms of XLID and established its role as an important diagnostic marker for pathology. It is shown that the study of X chromosome inactivation can identify asymptomatic female carriers of X-linked mutations to provide medical genetic counseling to families. An algorithm has been elaborated to diagnose XLID among the undifferentiated forms of mental developmental abnormalities in children. 

  19. The Ship Movement Trajectory Prediction Algorithm Using Navigational Data Fusion.

    Science.gov (United States)

    Borkowski, Piotr

    2017-06-20

    It is essential for the marine navigator conducting maneuvers of his ship at sea to know future positions of himself and target ships in a specific time span to effectively solve collision situations. This article presents an algorithm of ship movement trajectory prediction, which, through data fusion, takes into account measurements of the ship's current position from a number of doubled autonomous devices. This increases the reliability and accuracy of prediction. The algorithm has been implemented in NAVDEC, a navigation decision support system and practically used on board ships.

  20. A recurrence-weighted prediction algorithm for musical analysis

    Science.gov (United States)

    Colucci, Renato; Leguizamon Cucunuba, Juan Sebastián; Lloyd, Simon

    2018-03-01

    Forecasting the future behaviour of a system using past data is an important topic. In this article we apply nonlinear time series analysis in the context of music, and present new algorithms for extending a sample of music, while maintaining characteristics similar to the original piece. By using ideas from ergodic theory, we adapt the classical prediction method of Lorenz analogues so as to take into account recurrence times, and demonstrate with examples, how the new algorithm can produce predictions with a high degree of similarity to the original sample.

  1. The Ship Movement Trajectory Prediction Algorithm Using Navigational Data Fusion

    Directory of Open Access Journals (Sweden)

    Piotr Borkowski

    2017-06-01

    Full Text Available It is essential for the marine navigator conducting maneuvers of his ship at sea to know future positions of himself and target ships in a specific time span to effectively solve collision situations. This article presents an algorithm of ship movement trajectory prediction, which, through data fusion, takes into account measurements of the ship’s current position from a number of doubled autonomous devices. This increases the reliability and accuracy of prediction. The algorithm has been implemented in NAVDEC, a navigation decision support system and practically used on board ships.

  2. Real coded genetic algorithm for fuzzy time series prediction

    Science.gov (United States)

    Jain, Shilpa; Bisht, Dinesh C. S.; Singh, Phool; Mathpal, Prakash C.

    2017-10-01

    Genetic Algorithm (GA) forms a subset of evolutionary computing, rapidly growing area of Artificial Intelligence (A.I.). Some variants of GA are binary GA, real GA, messy GA, micro GA, saw tooth GA, differential evolution GA. This research article presents a real coded GA for predicting enrollments of University of Alabama. Data of Alabama University is a fuzzy time series. Here, fuzzy logic is used to predict enrollments of Alabama University and genetic algorithm optimizes fuzzy intervals. Results are compared to other eminent author works and found satisfactory, and states that real coded GA are fast and accurate.

  3. An algorithm to discover gene signatures with predictive potential

    Directory of Open Access Journals (Sweden)

    Hallett Robin M

    2010-09-01

    Full Text Available Abstract Background The advent of global gene expression profiling has generated unprecedented insight into our molecular understanding of cancer, including breast cancer. For example, human breast cancer patients display significant diversity in terms of their survival, recurrence, metastasis as well as response to treatment. These patient outcomes can be predicted by the transcriptional programs of their individual breast tumors. Predictive gene signatures allow us to correctly classify human breast tumors into various risk groups as well as to more accurately target therapy to ensure more durable cancer treatment. Results Here we present a novel algorithm to generate gene signatures with predictive potential. The method first classifies the expression intensity for each gene as determined by global gene expression profiling as low, average or high. The matrix containing the classified data for each gene is then used to score the expression of each gene based its individual ability to predict the patient characteristic of interest. Finally, all examined genes are ranked based on their predictive ability and the most highly ranked genes are included in the master gene signature, which is then ready for use as a predictor. This method was used to accurately predict the survival outcomes in a cohort of human breast cancer patients. Conclusions We confirmed the capacity of our algorithm to generate gene signatures with bona fide predictive ability. The simplicity of our algorithm will enable biological researchers to quickly generate valuable gene signatures without specialized software or extensive bioinformatics training.

  4. Application of a fast sorting algorithm to the assignment of mass spectrometric cross-linking data.

    Science.gov (United States)

    Petrotchenko, Evgeniy V; Borchers, Christoph H

    2014-09-01

    Cross-linking combined with MS involves enzymatic digestion of cross-linked proteins and identifying cross-linked peptides. Assignment of cross-linked peptide masses requires a search of all possible binary combinations of peptides from the cross-linked proteins' sequences, which becomes impractical with increasing complexity of the protein system and/or if digestion enzyme specificity is relaxed. Here, we describe the application of a fast sorting algorithm to search large sequence databases for cross-linked peptide assignments based on mass. This same algorithm has been used previously for assigning disulfide-bridged peptides (Choi et al., ), but has not previously been applied to cross-linking studies. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Which clustering algorithm is better for predicting protein complexes?

    Directory of Open Access Journals (Sweden)

    Moschopoulos Charalampos N

    2011-12-01

    Full Text Available Abstract Background Protein-Protein interactions (PPI play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks, which they comprise, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H and Tandem Affinity Purification (TAP methods. The predicted clusters, so called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy while it generates the fewest valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm

  6. A range-based predictive localization algorithm for WSID networks

    Science.gov (United States)

    Liu, Yuan; Chen, Junjie; Li, Gang

    2017-11-01

    Most studies on localization algorithms are conducted on the sensor networks with densely distributed nodes. However, the non-localizable problems are prone to occur in the network with sparsely distributed sensor nodes. To solve this problem, a range-based predictive localization algorithm (RPLA) is proposed in this paper for the wireless sensor networks syncretizing the RFID (WSID) networks. The Gaussian mixture model is established to predict the trajectory of a mobile target. Then, the received signal strength indication is used to reduce the residence area of the target location based on the approximate point-in-triangulation test algorithm. In addition, collaborative localization schemes are introduced to locate the target in the non-localizable situations. Simulation results verify that the RPLA achieves accurate localization for the network with sparsely distributed sensor nodes. The localization accuracy of the RPLA is 48.7% higher than that of the APIT algorithm, 16.8% higher than that of the single Gaussian model-based algorithm and 10.5% higher than that of the Kalman filtering-based algorithm.

  7. Fuzzy model predictive control algorithm applied in nuclear power plant

    International Nuclear Information System (INIS)

    Zuheir, Ahmad

    2006-01-01

    The aim of this paper is to design a predictive controller based on a fuzzy model. The Takagi-Sugeno fuzzy model with an Adaptive B-splines neuro-fuzzy implementation is used and incorporated as a predictor in a predictive controller. An optimization approach with a simplified gradient technique is used to calculate predictions of the future control actions. In this approach, adaptation of the fuzzy model using dynamic process information is carried out to build the predictive controller. The easy description of the fuzzy model and the easy computation of the gradient sector during the optimization procedure are the main advantages of the computation algorithm. The algorithm is applied to the control of a U-tube steam generation unit (UTSG) used for electricity generation. (author)

  8. Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms

    DEFF Research Database (Denmark)

    Nielsen, Henrik

    2015-01-01

    was used. Various statistical and machine learning algorithms are used with all three approaches, and various measures and standards are employed when reporting the performances of the developed methods. This chapter presents a number of available methods for prediction of sorting signals and subcellular...

  9. A link prediction method for heterogeneous networks based on BP neural network

    Science.gov (United States)

    Li, Ji-chao; Zhao, Dan-ling; Ge, Bing-Feng; Yang, Ke-Wei; Chen, Ying-Wu

    2018-04-01

    Most real-world systems, composed of different types of objects connected via many interconnections, can be abstracted as various complex heterogeneous networks. Link prediction for heterogeneous networks is of great significance for mining missing links and reconfiguring networks according to observed information, with considerable applications in, for example, friend and location recommendations and disease-gene candidate detection. In this paper, we put forward a novel integrated framework, called MPBP (Meta-Path feature-based BP neural network model), to predict multiple types of links for heterogeneous networks. More specifically, the concept of meta-path is introduced, followed by the extraction of meta-path features for heterogeneous networks. Next, based on the extracted meta-path features, a supervised link prediction model is built with a three-layer BP neural network. Then, the solution algorithm of the proposed link prediction model is put forward to obtain predicted results by iteratively training the network. Last, numerical experiments on the dataset of examples of a gene-disease network and a combat network are conducted to verify the effectiveness and feasibility of the proposed MPBP. It shows that the MPBP with very good performance is superior to the baseline methods.

  10. A comprehensive comparison of network similarities for link prediction and spurious link elimination

    Science.gov (United States)

    Zhang, Peng; Qiu, Dan; Zeng, An; Xiao, Jinghua

    2018-06-01

    Identifying missing interactions in complex networks, known as link prediction, is realized by estimating the likelihood of the existence of a link between two nodes according to the observed links and nodes' attributes. Similar approaches have also been employed to identify and remove spurious links in networks which is crucial for improving the reliability of network data. In network science, the likelihood for two nodes having a connection strongly depends on their structural similarity. The key to address these two problems thus becomes how to objectively measure the similarity between nodes in networks. In the literature, numerous network similarity metrics have been proposed and their accuracy has been discussed independently in previous works. In this paper, we systematically compare the accuracy of 18 similarity metrics in both link prediction and spurious link elimination when the observed networks are very sparse or consist of inaccurate linking information. Interestingly, some methods have high prediction accuracy, they tend to perform low accuracy in identification spurious interaction. We further find that methods can be classified into several cluster according to their behaviors. This work is useful for guiding future use of these similarity metrics for different purposes.

  11. An improved shuffled frog leaping algorithm based evolutionary framework for currency exchange rate prediction

    Science.gov (United States)

    Dash, Rajashree

    2017-11-01

    Forecasting purchasing power of one currency with respect to another currency is always an interesting topic in the field of financial time series prediction. Despite the existence of several traditional and computational models for currency exchange rate forecasting, there is always a need for developing simpler and more efficient model, which will produce better prediction capability. In this paper, an evolutionary framework is proposed by using an improved shuffled frog leaping (ISFL) algorithm with a computationally efficient functional link artificial neural network (CEFLANN) for prediction of currency exchange rate. The model is validated by observing the monthly prediction measures obtained for three currency exchange data sets such as USD/CAD, USD/CHF, and USD/JPY accumulated within same period of time. The model performance is also compared with two other evolutionary learning techniques such as Shuffled frog leaping algorithm and Particle Swarm optimization algorithm. Practical analysis of results suggest that, the proposed model developed using the ISFL algorithm with CEFLANN network is a promising predictor model for currency exchange rate prediction compared to other models included in the study.

  12. Accurate Prediction of Coronary Artery Disease Using Bioinformatics Algorithms

    Directory of Open Access Journals (Sweden)

    Hajar Shafiee

    2016-06-01

    Full Text Available Background and Objectives: Cardiovascular disease is one of the main causes of death in developed and Third World countries. According to the statement of the World Health Organization, it is predicted that death due to heart disease will rise to 23 million by 2030. According to the latest statistics reported by Iran’s Minister of health, 3.39% of all deaths are attributed to cardiovascular diseases and 19.5% are related to myocardial infarction. The aim of this study was to predict coronary artery disease using data mining algorithms. Methods: In this study, various bioinformatics algorithms, such as decision trees, neural networks, support vector machines, clustering, etc., were used to predict coronary heart disease. The data used in this study was taken from several valid databases (including 14 data. Results: In this research, data mining techniques can be effectively used to diagnose different diseases, including coronary artery disease. Also, for the first time, a prediction system based on support vector machine with the best possible accuracy was introduced. Conclusion: The results showed that among the features, thallium scan variable is the most important feature in the diagnosis of heart disease. Designation of machine prediction models, such as support vector machine learning algorithm can differentiate between sick and healthy individuals with 100% accuracy.

  13. Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis

    Directory of Open Access Journals (Sweden)

    Charles Frank

    2018-03-01

    Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.

  14. Fast prediction of RNA-RNA interaction using heuristic algorithm.

    Science.gov (United States)

    Montaseri, Soheila

    2015-01-01

    Interaction between two RNA molecules plays a crucial role in many medical and biological processes such as gene expression regulation. In this process, an RNA molecule prohibits the translation of another RNA molecule by establishing stable interactions with it. Some algorithms have been formed to predict the structure of the RNA-RNA interaction. High computational time is a common challenge in most of the presented algorithms. In this context, a heuristic method is introduced to accurately predict the interaction between two RNAs based on minimum free energy (MFE). This algorithm uses a few dot matrices for finding the secondary structure of each RNA and binding sites between two RNAs. Furthermore, a parallel version of this method is presented. We describe the algorithm's concurrency and parallelism for a multicore chip. The proposed algorithm has been performed on some datasets including CopA-CopT, R1inv-R2inv, Tar-Tar*, DIS-DIS, and IncRNA54-RepZ in Escherichia coli bacteria. The method has high validity and efficiency, and it is run in low computational time in comparison to other approaches.

  15. Link Prediction via Convex Nonnegative Matrix Factorization on Multiscale Blocks

    Directory of Open Access Journals (Sweden)

    Enming Dong

    2014-01-01

    Full Text Available Low rank matrices approximations have been used in link prediction for networks, which are usually global optimal methods and lack of using the local information. The block structure is a significant local feature of matrices: entities in the same block have similar values, which implies that links are more likely to be found within dense blocks. We use this insight to give a probabilistic latent variable model for finding missing links by convex nonnegative matrix factorization with block detection. The experiments show that this method gives better prediction accuracy than original method alone. Different from the original low rank matrices approximations methods for link prediction, the sparseness of solutions is in accord with the sparse property for most real complex networks. Scaling to massive size network, we use the block information mapping matrices onto distributed architectures and give a divide-and-conquer prediction method. The experiments show that it gives better results than common neighbors method when the networks have a large number of missing links.

  16. Common neighbours and the local-community-paradigm for topological link prediction in bipartite networks

    International Nuclear Information System (INIS)

    Daminelli, Simone; Thomas, Josephine Maria; Durán, Claudio; Vittorio Cannistraci, Carlo

    2015-01-01

    Bipartite networks are powerful descriptions of complex systems characterized by two different classes of nodes and connections allowed only across but not within the two classes. Unveiling physical principles, building theories and suggesting physical models to predict bipartite links such as product-consumer connections in recommendation systems or drug–target interactions in molecular networks can provide priceless information to improve e-commerce or to accelerate pharmaceutical research. The prediction of nonobserved connections starting from those already present in the topology of a network is known as the link-prediction problem. It represents an important subject both in many-body interaction theory in physics and in new algorithms for applied tools in computer science. The rationale is that the existing connectivity structure of a network can suggest where new connections can appear with higher likelihood in an evolving network, or where nonobserved connections are missing in a partially known network. Surprisingly, current complex network theory presents a theoretical bottle-neck: a general framework for local-based link prediction directly in the bipartite domain is missing. Here, we overcome this theoretical obstacle and present a formal definition of common neighbour index and local-community-paradigm (LCP) for bipartite networks. As a consequence, we are able to introduce the first node-neighbourhood-based and LCP-based models for topological link prediction that utilize the bipartite domain. We performed link prediction evaluations in several networks of different size and of disparate origin, including technological, social and biological systems. Our models significantly improve topological prediction in many bipartite networks because they exploit local physical driving-forces that participate in the formation and organization of many real-world bipartite networks. Furthermore, we present a local-based formalism that allows to intuitively

  17. Common neighbours and the local-community-paradigm for topological link prediction in bipartite networks

    Science.gov (United States)

    Daminelli, Simone; Thomas, Josephine Maria; Durán, Claudio; Vittorio Cannistraci, Carlo

    2015-11-01

    Bipartite networks are powerful descriptions of complex systems characterized by two different classes of nodes and connections allowed only across but not within the two classes. Unveiling physical principles, building theories and suggesting physical models to predict bipartite links such as product-consumer connections in recommendation systems or drug-target interactions in molecular networks can provide priceless information to improve e-commerce or to accelerate pharmaceutical research. The prediction of nonobserved connections starting from those already present in the topology of a network is known as the link-prediction problem. It represents an important subject both in many-body interaction theory in physics and in new algorithms for applied tools in computer science. The rationale is that the existing connectivity structure of a network can suggest where new connections can appear with higher likelihood in an evolving network, or where nonobserved connections are missing in a partially known network. Surprisingly, current complex network theory presents a theoretical bottle-neck: a general framework for local-based link prediction directly in the bipartite domain is missing. Here, we overcome this theoretical obstacle and present a formal definition of common neighbour index and local-community-paradigm (LCP) for bipartite networks. As a consequence, we are able to introduce the first node-neighbourhood-based and LCP-based models for topological link prediction that utilize the bipartite domain. We performed link prediction evaluations in several networks of different size and of disparate origin, including technological, social and biological systems. Our models significantly improve topological prediction in many bipartite networks because they exploit local physical driving-forces that participate in the formation and organization of many real-world bipartite networks. Furthermore, we present a local-based formalism that allows to intuitively

  18. Prediction of Baseflow Index of Catchments using Machine Learning Algorithms

    Science.gov (United States)

    Yadav, B.; Hatfield, K.

    2017-12-01

    We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elasticnet, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from daily streamflow hydrograph using HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation model, soil, landuse, climate data, other publicly available ancillary and geospatial data. 80% catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables selected after the careful evaluation of bias-variance tradeoff include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation. The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized

  19. Predicting Coastal Flood Severity using Random Forest Algorithm

    Science.gov (United States)

    Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.

    2017-12-01

    Coastal floods have become more common recently and are predicted to further increase in frequency and severity due to sea level rise. Predicting floods in coastal cities can be difficult due to the number of environmental and geographic factors which can influence flooding events. Built stormwater infrastructure and irregular urban landscapes add further complexity. This paper demonstrates the use of machine learning algorithms in predicting street flood occurrence in an urban coastal setting. The model is trained and evaluated using data from Norfolk, Virginia USA from September 2010 - October 2016. Rainfall, tide levels, water table levels, and wind conditions are used as input variables. Street flooding reports made by city workers after named and unnamed storm events, ranging from 1-159 reports per event, are the model output. Results show that Random Forest provides predictive power in estimating the number of flood occurrences given a set of environmental conditions with an out-of-bag root mean squared error of 4.3 flood reports and a mean absolute error of 0.82 flood reports. The Random Forest algorithm performed much better than Poisson regression. From the Random Forest model, total daily rainfall was by far the most important factor in flood occurrence prediction, followed by daily low tide and daily higher high tide. The model demonstrated here could be used to predict flood severity based on forecast rainfall and tide conditions and could be further enhanced using more complete street flooding data for model training.

  20. Predicting the growth of new links by new preferential attachment ...

    Indian Academy of Sciences (India)

    2014-03-07

    Mar 7, 2014 ... ... Science and Engineering, Central University of Finance and Economics, ... nism of network evolution, and also for predicting the growth of new links, without .... the high voltage transmission lines between them. ..... 6104104, 11147121 and 61104143), the Scientific Research Fund of Education Depart-.

  1. Comparison of genetic algorithm and imperialist competitive algorithms in predicting bed load transport in clean pipe.

    Science.gov (United States)

    Ebtehaj, Isa; Bonakdari, Hossein

    2014-01-01

    The existence of sediments in wastewater greatly affects the performance of the sewer and wastewater transmission systems. Increased sedimentation in wastewater collection systems causes problems such as reduced transmission capacity and early combined sewer overflow. The article reviews the performance of the genetic algorithm (GA) and imperialist competitive algorithm (ICA) in minimizing the target function (mean square error of observed and predicted Froude number). To study the impact of bed load transport parameters, using four non-dimensional groups, six different models have been presented. Moreover, the roulette wheel selection method is used to select the parents. The ICA with root mean square error (RMSE) = 0.007, mean absolute percentage error (MAPE) = 3.5% show better results than GA (RMSE = 0.007, MAPE = 5.6%) for the selected model. All six models return better results than the GA. Also, the results of these two algorithms were compared with multi-layer perceptron and existing equations.

  2. Link Prediction in Social Networks: the State-of-the-Art

    OpenAIRE

    Wang, Peng; Xu, Baowen; Wu, Yurong; Zhou, Xiaoyu

    2014-01-01

    In social networks, link prediction predicts missing links in current networks and new or dissolution links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, many works have been done about the link prediction in social networks. The goal of this paper is to comprehensively review, analyze and discuss the state-of-the-art of the link prediction in social networks. A systematical category for link prediction techniques and problems ...

  3. Predicting disease risk using bootstrap ranking and classification algorithms.

    Directory of Open Access Journals (Sweden)

    Ohad Manor

    Full Text Available Genome-wide association studies (GWAS are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a "black box" in order to promote changes in life-style and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping in order to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minimum allele frequency (MAF, suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment.

  4. A causal link between prediction errors, dopamine neurons and learning.

    Science.gov (United States)

    Steinberg, Elizabeth E; Keiflin, Ronald; Boivin, Josiah R; Witten, Ilana B; Deisseroth, Karl; Janak, Patricia H

    2013-07-01

    Situations in which rewards are unexpectedly obtained or withheld represent opportunities for new learning. Often, this learning includes identifying cues that predict reward availability. Unexpected rewards strongly activate midbrain dopamine neurons. This phasic signal is proposed to support learning about antecedent cues by signaling discrepancies between actual and expected outcomes, termed a reward prediction error. However, it is unknown whether dopamine neuron prediction error signaling and cue-reward learning are causally linked. To test this hypothesis, we manipulated dopamine neuron activity in rats in two behavioral procedures, associative blocking and extinction, that illustrate the essential function of prediction errors in learning. We observed that optogenetic activation of dopamine neurons concurrent with reward delivery, mimicking a prediction error, was sufficient to cause long-lasting increases in cue-elicited reward-seeking behavior. Our findings establish a causal role for temporally precise dopamine neuron signaling in cue-reward learning, bridging a critical gap between experimental evidence and influential theoretical frameworks.

  5. Predictive Power Estimation Algorithm (PPEA--a new algorithm to reduce overfitting for genomic biomarker discovery.

    Directory of Open Access Journals (Sweden)

    Jiangang Liu

    Full Text Available Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high dimensional and high noise genomic data is prone to over-fitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model intends to predict that can make it challenging to interpret the modeling results. To address these issues, we developed a novel algorithm, Predictive Power Estimation Algorithm (PPEA, which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By repeatedly enforcing that the sample number is larger than the transcript number, in each iteration of modeling and testing, PPEA reduces the potential risk of overfitting. We show with three different cases studies that: (1 PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2 the top ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3 using only the most predictive top ranked transcripts greatly facilitates development of multiplex assay such as qRT-PCR as a biomarker, and (4 more importantly, we were able to demonstrate that a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that the PPEA model effectively addresses the over-fitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.

  6. The alliance relationship analysis of international terrorist organizations with link prediction

    Science.gov (United States)

    Fang, Ling; Fang, Haiyang; Tian, Yanfang; Yang, Tinghong; Zhao, Jing

    2017-09-01

    Terrorism is a huge public hazard of the international community. Alliances of terrorist organizations may cause more serious threat to national security and world peace. Understanding alliances between global terrorist organizations will facilitate more effective anti-terrorism collaboration between governments. Based on publicly available data, this study constructed a alliance network between terrorist organizations and analyzed the alliance relationships with link prediction. We proposed a novel index based on optimal weighted fusion of six similarity indices, in which the optimal weight is calculated by genetic algorithm. Our experimental results showed that this algorithm could achieve better results on the networks than other algorithms. Using this method, we successfully digged out 21 real terrorist organizations alliance from current data. Our experiment shows that this approach used for terrorist organizations alliance mining is effective and this study is expected to benefit the form of a more powerful anti-terrorism strategy.

  7. Incorporating functional inter-relationships into protein function prediction algorithms

    Directory of Open Access Journals (Sweden)

    Kumar Vipin

    2009-05-01

    Full Text Available Abstract Background Functional classification schemes (e.g. the Gene Ontology that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder to learn distinction between similar GO terms, for standard classification-based approaches. Results We propose a method to enhance the performance of classification-based protein function prediction algorithms by addressing the issue of using these interrelationships between functional classes constituting functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes benefitted most by this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of

  8. Inverse kinematics algorithm for a six-link manipulator using a polynomial expression

    International Nuclear Information System (INIS)

    Sasaki, Shinobu

    1987-01-01

    This report is concerned with the forward and inverse kinematics problem relevant to a six-link robot manipulator. In order to derive the kinematic relationships between links, the vector rotation operator was applied instead of the conventional homogeneous transformation. The exact algorithm for solving the inverse problem was obtained by transforming kinematics equations into a polynomial. As shown in test calculations, the accuracies of numerical solutions obtained by means of the present approach are found to be quite high. The algorithm proposed permits to find out all feasible solutions for the given inverse problem. (author)

  9. Chaos Time Series Prediction Based on Membrane Optimization Algorithms

    Directory of Open Access Journals (Sweden)

    Meng Li

    2015-01-01

    Full Text Available This paper puts forward a prediction model based on membrane computing optimization algorithm for chaos time series; the model optimizes simultaneously the parameters of phase space reconstruction (τ,m and least squares support vector machine (LS-SVM (γ,σ by using membrane computing optimization algorithm. It is an important basis for spectrum management to predict accurately the change trend of parameters in the electromagnetic environment, which can help decision makers to adopt an optimal action. Then, the model presented in this paper is used to forecast band occupancy rate of frequency modulation (FM broadcasting band and interphone band. To show the applicability and superiority of the proposed model, this paper will compare the forecast model presented in it with conventional similar models. The experimental results show that whether single-step prediction or multistep prediction, the proposed model performs best based on three error measures, namely, normalized mean square error (NMSE, root mean square error (RMSE, and mean absolute percentage error (MAPE.

  10. Shape: automatic conformation prediction of carbohydrates using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Rosen Jimmy

    2009-09-01

    Full Text Available Abstract Background Detailed experimental three dimensional structures of carbohydrates are often difficult to acquire. Molecular modelling and computational conformation prediction are therefore commonly used tools for three dimensional structure studies. Modelling procedures generally require significant training and computing resources, which is often impractical for most experimental chemists and biologists. Shape has been developed to improve the availability of modelling in this field. Results The Shape software package has been developed for simplicity of use and conformation prediction performance. A trivial user interface coupled to an efficient genetic algorithm conformation search makes it a powerful tool for automated modelling. Carbohydrates up to a few hundred atoms in size can be investigated on common computer hardware. It has been shown to perform well for the prediction of over four hundred bioactive oligosaccharides, as well as compare favourably with previously published studies on carbohydrate conformation prediction. Conclusion The Shape fully automated conformation prediction can be used by scientists who lack significant modelling training, and performs well on computing hardware such as laptops and desktops. It can also be deployed on computer clusters for increased capacity. The prediction accuracy under the default settings is good, as it agrees well with experimental data and previously published conformation prediction studies. This software is available both as open source and under commercial licenses.

  11. Prediction models and control algorithms for predictive applications of setback temperature in cooling systems

    International Nuclear Information System (INIS)

    Moon, Jin Woo; Yoon, Younju; Jeon, Young-Hoon; Kim, Sooyoung

    2017-01-01

    Highlights: • Initial ANN model was developed for predicting the time to the setback temperature. • Initial model was optimized for producing accurate output. • Optimized model proved its prediction accuracy. • ANN-based algorithms were developed and tested their performance. • ANN-based algorithms presented superior thermal comfort or energy efficiency. - Abstract: In this study, a temperature control algorithm was developed to apply a setback temperature predictively for the cooling system of a residential building during occupied periods by residents. An artificial neural network (ANN) model was developed to determine the required time for increasing the current indoor temperature to the setback temperature. This study involved three phases: development of the initial ANN-based prediction model, optimization and testing of the initial model, and development and testing of three control algorithms. The development and performance testing of the model and algorithm were conducted using TRNSYS and MATLAB. Through the development and optimization process, the final ANN model employed indoor temperature and the temperature difference between the current and target setback temperature as two input neurons. The optimal number of hidden layers, number of neurons, learning rate, and moment were determined to be 4, 9, 0.6, and 0.9, respectively. The tangent–sigmoid and pure-linear transfer function was used in the hidden and output neurons, respectively. The ANN model used 100 training data sets with sliding-window method for data management. Levenberg-Marquart training method was employed for model training. The optimized model had a prediction accuracy of 0.9097 root mean square errors when compared with the simulated results. Employing the ANN model, ANN-based algorithms maintained indoor temperatures better within target ranges. Compared to the conventional algorithm, the ANN-based algorithms reduced the duration of time, in which the indoor temperature

  12. Algorithms for assessing person-based consistency among linked records for the investigation of maternal use of medications and safety

    Directory of Open Access Journals (Sweden)

    Duong Tran

    2017-04-01

    Quality assessment indicated high consistency among linked records. The set of algorithms developed in this project can be applied to similar linked perinatal datasets to promote a consistent approach and comparability across studies.

  13. The integration of weighted human gene association networks based on link prediction.

    Science.gov (United States)

    Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing

    2017-01-31

    Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, it is found that these existing human gene association networks share only quite limited overlapped interactions, suggesting their incompleteness and noise. Here we proposed a workflow to construct a weighted human gene association network using information of six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied link prediction algorithm to predict possible missing links of the networks, cross-validation approach to refine each network and finally integrated the refined networks to get the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network owns much more links than most of the original networks, meanwhile its links still keep high functional relevance. Being used as background network in a case study of disease gene prediction, the final integrated network presents good performance, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.

  14. BPP: a sequence-based algorithm for branch point prediction.

    Science.gov (United States)

    Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing

    2017-10-15

    Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. A novel multilayer model for missing link prediction and future link forecasting in dynamic complex networks

    Science.gov (United States)

    Yasami, Yasser; Safaei, Farshad

    2018-02-01

    The traditional complex network theory is particularly focused on network models in which all network constituents are dealt with equivalently, while fail to consider the supplementary information related to the dynamic properties of the network interactions. This is a main constraint leading to incorrect descriptions of some real-world phenomena or incomplete capturing the details of certain real-life problems. To cope with the problem, this paper addresses the multilayer aspects of dynamic complex networks by analyzing the properties of intrinsically multilayered co-authorship networks, DBLP and Astro Physics, and presenting a novel multilayer model of dynamic complex networks. The model examines the layers evolution (layers birth/death process and lifetime) throughout the network evolution. Particularly, this paper models the evolution of each node's membership in different layers by an Infinite Factorial Hidden Markov Model considering feature cascade, and thereby formulates the link generation process for intra-layer and inter-layer links. Although adjacency matrixes are useful to describe the traditional single-layer networks, such a representation is not sufficient to describe and analyze the multilayer dynamic networks. This paper also extends a generalized mathematical infrastructure to address the problems issued by multilayer complex networks. The model inference is performed using some Markov Chain Monte Carlo sampling strategies, given synthetic and real complex networks data. Experimental results indicate a tremendous improvement in the performance of the proposed multilayer model in terms of sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, F1-score, Matthews correlation coefficient, and accuracy for two important applications of missing link prediction and future link forecasting. The experimental results also indicate the strong predictivepower of the proposed model for the application of

  16. A probabilistic fragment-based protein structure prediction algorithm.

    Directory of Open Access Journals (Sweden)

    David Simoncini

    Full Text Available Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].

  17. ANNIT - An Efficient Inversion Algorithm based on Prediction Principles

    Science.gov (United States)

    Růžek, B.; Kolář, P.

    2009-04-01

    Solution of inverse problems represents meaningful job in geophysics. The amount of data is continuously increasing, methods of modeling are being improved and the computer facilities are also advancing great technical progress. Therefore the development of new and efficient algorithms and computer codes for both forward and inverse modeling is still up to date. ANNIT is contributing to this stream since it is a tool for efficient solution of a set of non-linear equations. Typical geophysical problems are based on parametric approach. The system is characterized by a vector of parameters p, the response of the system is characterized by a vector of data d. The forward problem is usually represented by unique mapping F(p)=d. The inverse problem is much more complex and the inverse mapping p=G(d) is available in an analytical or closed form only exceptionally and generally it may not exist at all. Technically, both forward and inverse mapping F and G are sets of non-linear equations. ANNIT solves such situation as follows: (i) joint subspaces {pD, pM} of original data and model spaces D, M, resp. are searched for, within which the forward mapping F is sufficiently smooth that the inverse mapping G does exist, (ii) numerical approximation of G in subspaces {pD, pM} is found, (iii) candidate solution is predicted by using this numerical approximation. ANNIT is working in an iterative way in cycles. The subspaces {pD, pM} are searched for by generating suitable populations of individuals (models) covering data and model spaces. The approximation of the inverse mapping is made by using three methods: (a) linear regression, (b) Radial Basis Function Network technique, (c) linear prediction (also known as "Kriging"). The ANNIT algorithm has built in also an archive of already evaluated models. Archive models are re-used in a suitable way and thus the number of forward evaluations is minimized. ANNIT is now implemented both in MATLAB and SCILAB. Numerical tests show good

  18. RAINLINK: Retrieval algorithm for rainfall monitoring employing microwave links from a cellular communication network

    Science.gov (United States)

    Uijlenhoet, R.; Overeem, A.; Leijnse, H.; Rios Gaona, M. F.

    2017-12-01

    The basic principle of rainfall estimation using microwave links is as follows. Rainfall attenuates the electromagnetic signals transmitted from one telephone tower to another. By measuring the received power at one end of a microwave link as a function of time, the path-integrated attenuation due to rainfall can be calculated, which can be converted to average rainfall intensities over the length of a link. Microwave links from cellular communication networks have been proposed as a promising new rainfall measurement technique for one decade. They are particularly interesting for those countries where few surface rainfall observations are available. Yet to date no operational (real-time) link-based rainfall products are available. To advance the process towards operational application and upscaling of this technique, there is a need for freely available, user-friendly computer code for microwave link data processing and rainfall mapping. Such software is now available as R package "RAINLINK" on GitHub (https://github.com/overeem11/RAINLINK). It contains a working example to compute link-based 15-min rainfall maps for the entire surface area of The Netherlands for 40 hours from real microwave link data. This is a working example using actual data from an extensive network of commercial microwave links, for the first time, which will allow users to test their own algorithms and compare their results with ours. The package consists of modular functions, which facilitates running only part of the algorithm. The main processings steps are: 1) Preprocessing of link data (initial quality and consistency checks); 2) Wet-dry classification using link data; 3) Reference signal determination; 4) Removal of outliers ; 5) Correction of received signal powers; 6) Computation of mean path-averaged rainfall intensities; 7) Interpolation of rainfall intensities ; 8) Rainfall map visualisation. Some applications of RAINLINK will be shown based on microwave link data from a

  19. Novel Intermode Prediction Algorithm for High Efficiency Video Coding Encoder

    Directory of Open Access Journals (Sweden)

    Chan-seob Park

    2014-01-01

    Full Text Available The joint collaborative team on video coding (JCT-VC is developing the next-generation video coding standard which is called high efficiency video coding (HEVC. In the HEVC, there are three units in block structure: coding unit (CU, prediction unit (PU, and transform unit (TU. The CU is the basic unit of region splitting like macroblock (MB. Each CU performs recursive splitting into four blocks with equal size, starting from the tree block. In this paper, we propose a fast CU depth decision algorithm for HEVC technology to reduce its computational complexity. In 2N×2N PU, the proposed method compares the rate-distortion (RD cost and determines the depth using the compared information. Moreover, in order to speed up the encoding time, the efficient merge SKIP detection method is developed additionally based on the contextual mode information of neighboring CUs. Experimental result shows that the proposed algorithm achieves the average time-saving factor of 44.84% in the random access (RA at Main profile configuration with the HEVC test model (HM 10.0 reference software. Compared to HM 10.0 encoder, a small BD-bitrate loss of 0.17% is also observed without significant loss of image quality.

  20. Reliable prediction of adsorption isotherms via genetic algorithm molecular simulation.

    Science.gov (United States)

    LoftiKatooli, L; Shahsavand, A

    2017-01-01

    Conventional molecular simulation techniques such as grand canonical Monte Carlo (GCMC) strictly rely on purely random search inside the simulation box for predicting the adsorption isotherms. This blind search is usually extremely time demanding for providing a faithful approximation of the real isotherm and in some cases may lead to non-optimal solutions. A novel approach is presented in this article which does not use any of the classical steps of the standard GCMC method, such as displacement, insertation, and removal. The new approach is based on the well-known genetic algorithm to find the optimal configuration for adsorption of any adsorbate on a structured adsorbent under prevailing pressure and temperature. The proposed approach considers the molecular simulation problem as a global optimization challenge. A detailed flow chart of our so-called genetic algorithm molecular simulation (GAMS) method is presented, which is entirely different from traditions molecular simulation approaches. Three real case studies (for adsorption of CO 2 and H 2 over various zeolites) are borrowed from literature to clearly illustrate the superior performances of the proposed method over the standard GCMC technique. For the present method, the average absolute values of percentage errors are around 11% (RHO-H 2 ), 5% (CHA-CO 2 ), and 16% (BEA-CO 2 ), while they were about 70%, 15%, and 40% for the standard GCMC technique, respectively.

  1. Ternary alloy material prediction using genetic algorithm and cluster expansion

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Chong [Iowa State Univ., Ames, IA (United States)

    2015-12-01

    This thesis summarizes our study on the crystal structures prediction of Fe-V-Si system using genetic algorithm and cluster expansion. Our goal is to explore and look for new stable compounds. We started from the current ten known experimental phases, and calculated formation energies of those compounds using density functional theory (DFT) package, namely, VASP. The convex hull was generated based on the DFT calculations of the experimental known phases. Then we did random search on some metal rich (Fe and V) compositions and found that the lowest energy structures were body centered cube (bcc) underlying lattice, under which we did our computational systematic searches using genetic algorithm and cluster expansion. Among hundreds of the searched compositions, thirteen were selected and DFT formation energies were obtained by VASP. The stability checking of those thirteen compounds was done in reference to the experimental convex hull. We found that the composition, 24-8-16, i.e., Fe3VSi2 is a new stable phase and it can be very inspiring to the future experiments.

  2. Fast Quantum Algorithm for Predicting Descriptive Statistics of Stochastic Processes

    Science.gov (United States)

    Williams Colin P.

    1999-01-01

    Stochastic processes are used as a modeling tool in several sub-fields of physics, biology, and finance. Analytic understanding of the long term behavior of such processes is only tractable for very simple types of stochastic processes such as Markovian processes. However, in real world applications more complex stochastic processes often arise. In physics, the complicating factor might be nonlinearities; in biology it might be memory effects; and in finance is might be the non-random intentional behavior of participants in a market. In the absence of analytic insight, one is forced to understand these more complex stochastic processes via numerical simulation techniques. In this paper we present a quantum algorithm for performing such simulations. In particular, we show how a quantum algorithm can predict arbitrary descriptive statistics (moments) of N-step stochastic processes in just O(square root of N) time. That is, the quantum complexity is the square root of the classical complexity for performing such simulations. This is a significant speedup in comparison to the current state of the art.

  3. Developing robust arsenic awareness prediction models using machine learning algorithms.

    Science.gov (United States)

    Singh, Sushant K; Taylor, Robert W; Rahman, Mohammad Mahmudur; Pradhan, Biswajeet

    2018-04-01

    Arsenic awareness plays a vital role in ensuring the sustainability of arsenic mitigation technologies. Thus far, however, few studies have dealt with the sustainability of such technologies and its associated socioeconomic dimensions. As a result, arsenic awareness prediction has not yet been fully conceptualized. Accordingly, this study evaluated arsenic awareness among arsenic-affected communities in rural India, using a structured questionnaire to record socioeconomic, demographic, and other sociobehavioral factors with an eye to assessing their association with and influence on arsenic awareness. First a logistic regression model was applied and its results compared with those produced by six state-of-the-art machine-learning algorithms (Support Vector Machine [SVM], Kernel-SVM, Decision Tree [DT], k-Nearest Neighbor [k-NN], Naïve Bayes [NB], and Random Forests [RF]) as measured by their accuracy at predicting arsenic awareness. Most (63%) of the surveyed population was found to be arsenic-aware. Significant arsenic awareness predictors were divided into three types: (1) socioeconomic factors: caste, education level, and occupation; (2) water and sanitation behavior factors: number of family members involved in water collection, distance traveled and time spent for water collection, places for defecation, and materials used for handwashing after defecation; and (3) social capital and trust factors: presence of anganwadi and people's trust in other community members, NGOs, and private agencies. Moreover, individuals' having higher social network positively contributed to arsenic awareness in the communities. Results indicated that both the SVM and the RF algorithms outperformed at overall prediction of arsenic awareness-a nonlinear classification problem. Lower-caste, less educated, and unemployed members of the population were found to be the most vulnerable, requiring immediate arsenic mitigation. To this end, local social institutions and NGOs could play a

  4. Predictive algorithms for early detection of retinopathy of prematurity.

    Science.gov (United States)

    Piermarocchi, Stefano; Bini, Silvia; Martini, Ferdinando; Berton, Marianna; Lavini, Anna; Gusson, Elena; Marchini, Giorgio; Padovani, Ezio Maria; Macor, Sara; Pignatto, Silvia; Lanzetta, Paolo; Cattarossi, Luigi; Baraldi, Eugenio; Lago, Paola

    2017-03-01

    To evaluate sensitivity, specificity and the safest cut-offs of three predictive algorithms (WINROP, ROPScore and CHOP ROP) for retinopathy of prematurity (ROP). A retrospective study was conducted in three centres from 2012 to 2014; 445 preterms with gestational age (GA) ≤ 30 weeks and/or birthweight (BW) ≤ 1500 g, and additional unstable cases, were included. No-ROP, mild and type 1 ROP were categorized. The algorithms were analysed for infants with all parameters (GA, BW, weight gain, oxygen therapy, blood transfusion) needed for calculation (399 babies). Retinopathy of prematurity (ROP) was identified in both eyes in 116 patients (26.1%), and 44 (9.9%) had type 1 ROP. Gestational age and BW were significantly lower in ROP group compared with no-ROP subjects (GA: 26.7 ± 2.2 and 30.2 ± 1.9, respectively, p < 0.0001; BW: 839.8 ± 287.0 and 1288.1 ± 321.5 g, respectively, p = 0.0016). Customized alarms of ROPScore and CHOP ROP correctly identified all infants having any ROP or type 1 ROP. WINROP missed 19 cases of ROP, including three type 1 ROP. ROPScore and CHOP ROP provided the best performances with an area under the receiver operating characteristic curve for the detection of severe ROP of 0.93 (95% CI, 0.90-0.96, and 95% CI, 0.89-0.96, respectively), and WINROP obtained 0.83 (95% CI, 0.77-0.87). Median time from alarm to treatment was 11.1, 5.1 and 9.1 weeks, for WINROP, ROPScore and CHOP ROP, respectively. ROPScore and CHOP ROP showed 100% sensitivity to identify sight-threatening ROP. Predictive algorithms are a reliable tool for early identification of infants requiring referral to an ophthalmologist, for reorganizing resources and reducing stressful procedures to preterm babies. © 2016 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.

  5. A Framing Link Based Tabu Search Algorithm for Large-Scale Multidepot Vehicle Routing Problems

    Directory of Open Access Journals (Sweden)

    Xuhao Zhang

    2014-01-01

    Full Text Available A framing link (FL based tabu search algorithm is proposed in this paper for a large-scale multidepot vehicle routing problem (LSMDVRP. Framing links are generated during continuous great optimization of current solutions and then taken as skeletons so as to improve optimal seeking ability, speed up the process of optimization, and obtain better results. Based on the comparison between pre- and postmutation routes in the current solution, different parts are extracted. In the current optimization period, links involved in the optimal solution are regarded as candidates to the FL base. Multiple optimization periods exist in the whole algorithm, and there are several potential FLs in each period. If the update condition is satisfied, the FL base is updated, new FLs are added into the current route, and the next period starts. Through adjusting the borderline of multidepot sharing area with dynamic parameters, the authors define candidate selection principles for three kinds of customer connections, respectively. Link split and the roulette approach are employed to choose FLs. 18 LSMDVRP instances in three groups are studied and new optimal solution values for nine of them are obtained, with higher computation speed and reliability.

  6. A conservative fully implicit algorithm for predicting slug flows

    Science.gov (United States)

    Krasnopolsky, Boris I.; Lukyanov, Alexander A.

    2018-02-01

    An accurate and predictive modelling of slug flows is required by many industries (e.g., oil and gas, nuclear engineering, chemical engineering) to prevent undesired events potentially leading to serious environmental accidents. For example, the hydrodynamic and terrain-induced slugging leads to unwanted unsteady flow conditions. This demands the development of fast and robust numerical techniques for predicting slug flows. The presented in this paper study proposes a multi-fluid model and its implementation method accounting for phase appearance and disappearance. The numerical modelling of phase appearance and disappearance presents a complex numerical challenge for all multi-component and multi-fluid models. Numerical challenges arise from the singular systems of equations when some phases are absent and from the solution discontinuity when some phases appear or disappear. This paper provides a flexible and robust solution to these issues. A fully implicit formulation described in this work enables to efficiently solve governing fluid flow equations. The proposed numerical method provides a modelling capability of phase appearance and disappearance processes, which is based on switching procedure between various sets of governing equations. These sets of equations are constructed using information about the number of phases present in the computational domain. The proposed scheme does not require an explicit truncation of solutions leading to a conservative scheme for mass and linear momentum. A transient two-fluid model is used to verify and validate the proposed algorithm for conditions of hydrodynamic and terrain-induced slug flow regimes. The developed modelling capabilities allow to predict all the major features of the experimental data, and are in a good quantitative agreement with them.

  7. Predicting the Retention Behavior of Specific O-Linked Glycopeptides.

    Science.gov (United States)

    Badgett, Majors J; Boyes, Barry; Orlando, Ron

    2017-09-01

    O -Linked glycosylation is a common post-translational modification that can alter the overall structure, polarity, and function of proteins. Reverse-phase (RP) chromatography is the most common chromatographic approach to analyze O -glycosylated peptides and their unmodified counterparts, even though this approach often does not provide adequate separation of these two species. Hydrophilic interaction liquid chromatography (HILIC) can be a solution to this problem, as the polar glycan interacts with the polar stationary phase and potentially offers the ability to resolve the peptide from its modified form(s). In this paper, HILIC is used to separate peptides with O - N -acetylgalactosamine ( O -GalNAc), O - N -acetylglucosamine ( O -GlcNAc), and O -fucose additions from their native forms, and coefficients representing the extent of hydrophilicity were derived using linear regression analysis as a means to predict the retention times of peptides with these modifications.

  8. Verification and improvement of predictive algorithms for radionuclide migration

    International Nuclear Information System (INIS)

    Carnahan, C.L.; Miller, C.W.; Remer, J.S.

    1984-01-01

    This research addresses issues relevant to numerical simulation and prediction of migration of radionuclides in the environment of nuclear waste repositories. Specific issues investigated are the adequacy of current numerical codes in simulating geochemical interactions affecting radionuclide migration, the level of complexity required in chemical algorithms of transport models, and the validity of the constant-k/sub D/ concept in chemical transport modeling. An initial survey of the literature led to the conclusion that existing numerical codes did not encompass the full range of chemical and physical phenomena influential in radionuclide migration. Studies of chemical algorithms have been conducted within the framework of a one-dimensional numerical code that simulates the transport of chemically reacting solutes in a saturated porous medium. The code treats transport by dispersion/diffusion and advection, and equilibrium-controlled proceses of interphase mass transfer, complexation in the aqueous phase, pH variation, and precipitation/dissolution of secondary solids. Irreversible, time-dependent dissolution of solid phases during transport can be treated. Mass action, transport, and sorptive site constraint equations are expressed in differential/algebraic form and are solved simultaneously. Simulations using the code show that use of the constant-k/sub D/ concept can produce unreliable results in geochemical transport modeling. Applications to a field test and laboratory analogs of a nuclear waste repository indicate that a thermodynamically based simulator of chemical transport can successfully mimic real processes provided that operative chemical mechanisms and associated data have been correctly identified and measured, and have been incorporated in the simulator. 17 references, 10 figures

  9. Research on wind field algorithm of wind lidar based on BP neural network and grey prediction

    Science.gov (United States)

    Chen, Yong; Chen, Chun-Li; Luo, Xiong; Zhang, Yan; Yang, Ze-hou; Zhou, Jie; Shi, Xiao-ding; Wang, Lei

    2018-01-01

    This paper uses the BP neural network and grey algorithm to forecast and study radar wind field. In order to reduce the residual error in the wind field prediction which uses BP neural network and grey algorithm, calculating the minimum value of residual error function, adopting the residuals of the gray algorithm trained by BP neural network, using the trained network model to forecast the residual sequence, using the predicted residual error sequence to modify the forecast sequence of the grey algorithm. The test data show that using the grey algorithm modified by BP neural network can effectively reduce the residual value and improve the prediction precision.

  10. Application of XGBoost algorithm in hourly PM2.5 concentration prediction

    Science.gov (United States)

    Pan, Bingyue

    2018-02-01

    In view of prediction techniques of hourly PM2.5 concentration in China, this paper applied the XGBoost(Extreme Gradient Boosting) algorithm to predict hourly PM2.5 concentration. The monitoring data of air quality in Tianjin city was analyzed by using XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentration using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression models using computational results. The results demonstrate that the XGBoost algorithm outperforms other data mining methods.

  11. Using linked electronic data to validate algorithms for health outcomes in administrative databases.

    Science.gov (United States)

    Lee, Wan-Ju; Lee, Todd A; Pickard, Alan Simon; Shoaibi, Azadeh; Schumock, Glen T

    2015-08-01

    The validity of algorithms used to identify health outcomes in claims-based and administrative data is critical to the reliability of findings from observational studies. The traditional approach to algorithm validation, using medical charts, is expensive and time-consuming. An alternative method is to link the claims data to an external, electronic data source that contains information allowing confirmation of the event of interest. In this paper, we describe this external linkage validation method and delineate important considerations to assess the feasibility and appropriateness of validating health outcomes using this approach. This framework can help investigators decide whether to pursue an external linkage validation method for identifying health outcomes in administrative/claims data.

  12. Proportional-Type Performance Recovery DC-Link Voltage Tracking Algorithm for Permanent Magnet Synchronous Generators

    Directory of Open Access Journals (Sweden)

    Seok-Kyoon Kim

    2017-09-01

    Full Text Available This study proposes a disturbance observer-based proportional-type DC-link voltage tracking algorithm for permanent magnet synchronous generators (PMSGs. The proposed technique feedbacks the only proportional term of the tracking errors, and it contains the nominal static and dynamic feed-forward compensators coming from the first-order disturbance observers. It is rigorously proved that the proposed method ensures the performance recovery and offset-free properties without the use of the integrators of the tracking errors. A wind power generation system has been simulated to verify the efficacy of the proposed method using the PSIM (PowerSIM software with the DLL (Dynamic Link Library block.

  13. Artificial Neural Network Algorithm for Condition Monitoring of DC-link Capacitors Based on Capacitance Estimation

    DEFF Research Database (Denmark)

    Soliman, Hammam Abdelaal Hammam; Wang, Huai; Gadalla, Brwene Salah Abdelkarim

    2015-01-01

    challenges. A capacitance estimation method based on Artificial Neural Network (ANN) algorithm is therefore proposed in this paper. The implemented ANN estimated the capacitance of the DC-link capacitor in a back-toback converter. Analysis of the error of the capacitance estimation is also given......In power electronic converters, reliability of DC-link capacitors is one of the critical issues. The estimation of their health status as an application of condition monitoring have been an attractive subject for industrial field and hence for the academic research filed as well. More reliable...... solutions are required to be adopted by the industry applications in which usage of extra hardware, increased cost, and low estimation accuracy are the main challenges. Therefore, development of new condition monitoring methods based on software solutions could be the new era that covers the aforementioned...

  14. Energy Link Optimization in a Wireless Power Transfer Grid under Energy Autonomy Based on the Improved Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Zhihao Zhao

    2016-08-01

    Full Text Available In this paper, an optimization method is proposed for the energy link in a wireless power transfer grid, which is a regional smart microgrid comprised of distributed devices equipped with wireless power transfer technology in a certain area. The relevant optimization model of the energy link is established by considering the wireless power transfer characteristics and the grid characteristics brought in by the device repeaters. Then, a concentration adaptive genetic algorithm (CAGA is proposed to optimize the energy link. The algorithm avoided the unification trend by introducing the concentration mechanism and a new crossover method named forward order crossover, as well as the adaptive parameter mechanism, which are utilized together to keep the diversity of the optimization solution groups. The results show that CAGA is feasible and competitive for the energy link optimization in different situations. This proposed algorithm performs better than its counterparts in the global convergence ability and the algorithm robustness.

  15. Comparison of predictive performance of data mining algorithms in predicting body weight in Mengali rams of Pakistan

    Directory of Open Access Journals (Sweden)

    Senol Celik

    Full Text Available ABSTRACT The present study aimed at comparing predictive performance of some data mining algorithms (CART, CHAID, Exhaustive CHAID, MARS, MLP, and RBF in biometrical data of Mengali rams. To compare the predictive capability of the algorithms, the biometrical data regarding body (body length, withers height, and heart girth and testicular (testicular length, scrotal length, and scrotal circumference measurements of Mengali rams in predicting live body weight were evaluated by most goodness of fit criteria. In addition, age was considered as a continuous independent variable. In this context, MARS data mining algorithm was used for the first time to predict body weight in two forms, without (MARS_1 and with interaction (MARS_2 terms. The superiority order in the predictive accuracy of the algorithms was found as CART > CHAID ≈ Exhaustive CHAID > MARS_2 > MARS_1 > RBF > MLP. Moreover, all tested algorithms provided a strong predictive accuracy for estimating body weight. However, MARS is the only algorithm that generated a prediction equation for body weight. Therefore, it is hoped that the available results might present a valuable contribution in terms of predicting body weight and describing the relationship between the body weight and body and testicular measurements in revealing breed standards and the conservation of indigenous gene sources for Mengali sheep breeding. Therefore, it will be possible to perform more profitable and productive sheep production. Use of data mining algorithms is useful for revealing the relationship between body weight and testicular traits in describing breed standards of Mengali sheep.

  16. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors

    Directory of Open Access Journals (Sweden)

    Ricardo Acevedo-Avila

    2016-05-01

    Full Text Available Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.

  17. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

    Science.gov (United States)

    Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

    2016-05-28

    Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.

  18. An improved simplified model predictive control algorithm and its application to a continuous fermenter

    Directory of Open Access Journals (Sweden)

    W. H. Kwong

    2000-06-01

    Full Text Available The development of a new simplified model predictive control algorithm has been proposed in this work. The algorithm is developed within the framework of internal model control, and it is easy to understanding and implement. Simulation results for a continuous fermenter, which show that the proposed control algorithm is robust for moderate variations in plant parameters, are presented. The algorithm shows a good performance for setpoint tracking.

  19. Algorithm of dynamic stabilization system for a car 4x4 with a link rear axle

    Directory of Open Access Journals (Sweden)

    M. M. Jileikin

    2014-01-01

    Full Text Available The slow development of active safety systems of the automobile all-wheel drive vehicles is the cause of lack of researches in the field of power distribution under the specific conditions of movement. The purpose of work is to develop methods to control a curvilinear motion of 4x4 cars with a link to the rear axle that provides the increase in directional and trajectory stability of the car. The paper analyses the known methods to increase wheeled vehicles movement stability. It also offers a method for power flow redistribution in the transmission of the car 4x4 with a link to the rear axle, providing the increase in directional and trajectory stability of the car.To study the performance and effectiveness of the proposed method a mathematical model of the moving car 4x4 with a link to the rear axle is developed. Simulation methods allowed us to establish the following:1. for car 4x4 with redistribution of torque between the driving axles in the range of 100:0 - 50:50 and with redistribution of torque between the wheels of the rear axle in the range of 0:100 the most effective are the stabilization algorithms used in combination “Lowing power consumption of the engine +Creation of stabilizing the moment due to the redistribution of torque on different wheels", providing the increase in directional and trajectory stability by 12...93%;2. for car 4x4 with redistribution of torque between the driving axles in the range 100:0 - 0:100 and with redistribution of torque between the wheels of the rear axle in the range of 0:100 the best option is a combination of algorithms "Lowing power consumption of the engine + Creation of stabilizing moment due to redistribution of torques on different wheels", providing the increase in directional and trajectory stability by 27...93%.A comparative analysis of algorithms efficiency of dynamic stabilization system operation for two-axle wheeled vehicles depending on the torque redistribution between the driving

  20. An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

    Science.gov (United States)

    Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

    2017-12-01

    Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. CAT-PUMA: CME Arrival Time Prediction Using Machine learning Algorithms

    Science.gov (United States)

    Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert

    2018-04-01

    CAT-PUMA (CME Arrival Time Prediction Using Machine learning Algorithms) quickly and accurately predicts the arrival of Coronal Mass Ejections (CMEs) of CME arrival time. The software was trained via detailed analysis of CME features and solar wind parameters using 182 previously observed geo-effective partial-/full-halo CMEs and uses algorithms of the Support Vector Machine (SVM) to make its predictions, which can be made within minutes of providing the necessary input parameters of a CME.

  2. Research on Demand Prediction of Fresh Food Supply Chain Based on Improved Particle Swarm Optimization Algorithm

    OpenAIRE

    He Wang

    2015-01-01

    Demand prediction of supply chain is an important content and the first premise in supply management of different enterprises and has become one of the difficulties and hot research fields for the researchers related. The paper takes fresh food demand prediction for example and presents a new algorithm for predicting demand of fresh food supply chain. First, the working principle and the root causes of the defects of particle swarm optimization algorithm are analyzed in the study; Second, the...

  3. Comparison of four Adaboost algorithm based artificial neural networks in wind speed predictions

    International Nuclear Information System (INIS)

    Liu, Hui; Tian, Hong-qi; Li, Yan-fei; Zhang, Lei

    2015-01-01

    Highlights: • Four hybrid algorithms are proposed for the wind speed decomposition. • Adaboost algorithm is adopted to provide a hybrid training framework. • MLP neural networks are built to do the forecasting computation. • Four important network training algorithms are included in the MLP networks. • All the proposed hybrid algorithms are suitable for the wind speed predictions. - Abstract: The technology of wind speed prediction is important to guarantee the safety of wind power utilization. In this paper, four different hybrid methods are proposed for the high-precision multi-step wind speed predictions based on the Adaboost (Adaptive Boosting) algorithm and the MLP (Multilayer Perceptron) neural networks. In the hybrid Adaboost–MLP forecasting architecture, four important algorithms are adopted for the training and modeling of the MLP neural networks, including GD-ALR-BP algorithm, GDM-ALR-BP algorithm, CG-BP-FR algorithm and BFGS algorithm. The aim of the study is to investigate the promoted forecasting percentages of the MLP neural networks by the Adaboost algorithm’ optimization under various training algorithms. The hybrid models in the performance comparison include Adaboost–GD-ALR-BP–MLP, Adaboost–GDM-ALR-BP–MLP, Adaboost–CG-BP-FR–MLP, Adaboost–BFGS–MLP, GD-ALR-BP–MLP, GDM-ALR-BP–MLP, CG-BP-FR–MLP and BFGS–MLP. Two experimental results show that: (1) the proposed hybrid Adaboost–MLP forecasting architecture is effective for the wind speed predictions; (2) the Adaboost algorithm has promoted the forecasting performance of the MLP neural networks considerably; (3) among the proposed Adaboost–MLP forecasting models, the Adaboost–CG-BP-FR–MLP model has the best performance; and (4) the improved percentages of the MLP neural networks by the Adaboost algorithm decrease step by step with the following sequence of training algorithms as: GD-ALR-BP, GDM-ALR-BP, CG-BP-FR and BFGS

  4. Condition Monitoring for DC-link Capacitors Based on Artificial Neural Network Algorithm

    DEFF Research Database (Denmark)

    Soliman, Hammam Abdelaal Hammam; Wang, Huai; Gadalla, Brwene Salah Abdelkarim

    2015-01-01

    hardware will reduce the cost, and therefore could be more promising for industry applications. A condition monitoring method based on Artificial Neural Network (ANN) algorithm is therefore proposed in this paper. The implementation of the ANN to the DC-link capacitor condition monitoring in a back......In power electronic systems, capacitor is one of the reliability critical components . Recently, the condition monitoring of capacitors to estimate their health status have been attracted by the academic research. Industry applications require more reliable power electronics products...... with preventive maintenance. However, the existing capacitor condition monitoring methods suffer from either increased hardware cost or low estimation accuracy, being the challenges to be adopted in industry applications. New development in condition monitoring technology with software solutions without extra...

  5. Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

    Science.gov (United States)

    Kenah, Eben; Britton, Tom; Halloran, M. Elizabeth; Longini, Ira M.

    2016-01-01

    Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. PMID:27070316

  6. Algorithms and Methods for High-Performance Model Predictive Control

    DEFF Research Database (Denmark)

    Frison, Gianluca

    routines employed in the numerical tests. The main focus of this thesis is on linear MPC problems. In this thesis, both the algorithms and their implementation are equally important. About the implementation, a novel implementation strategy for the dense linear algebra routines in embedded optimization...... is proposed, aiming at improving the computational performance in case of small matrices. About the algorithms, they are built on top of the proposed linear algebra, and they are tailored to exploit the high-level structure of the MPC problems, with special care on reducing the computational complexity....

  7. Prediction of customer behaviour analysis using classification algorithms

    Science.gov (United States)

    Raju, Siva Subramanian; Dhandayudam, Prabha

    2018-04-01

    Customer Relationship management plays a crucial role in analyzing of customer behavior patterns and their values with an enterprise. Analyzing of customer data can be efficient performed using various data mining techniques, with the goal of developing business strategies and to enhance the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated for our experimental purpose. The performance measures of the three classifications are compared using three different parameters (accuracy, sensitivity, specificity) and experimental results expose J48 algorithm has better accuracy with compare to NB and MLPNN algorithm.

  8. Algorithms

    Indian Academy of Sciences (India)

    polynomial) division have been found in Vedic Mathematics which are dated much before Euclid's algorithm. A programming language Is used to describe an algorithm for execution on a computer. An algorithm expressed using a programming.

  9. Increasing Prediction the Original Final Year Project of Student Using Genetic Algorithm

    Science.gov (United States)

    Saragih, Rijois Iboy Erwin; Turnip, Mardi; Sitanggang, Delima; Aritonang, Mendarissan; Harianja, Eva

    2018-04-01

    Final year project is very important forgraduation study of a student. Unfortunately, many students are not seriouslydidtheir final projects. Many of studentsask for someone to do it for them. In this paper, an application of genetic algorithms to predict the original final year project of a studentis proposed. In the simulation, the data of the final project for the last 5 years is collected. The genetic algorithm has several operators namely population, selection, crossover, and mutation. The result suggest that genetic algorithm can do better prediction than other comparable model. Experimental results of predicting showed that 70% was more accurate than the previous researched.

  10. Non-linear multivariable predictive control of an alcoholic fermentation process using functional link networks

    Directory of Open Access Journals (Sweden)

    Luiz Augusto da Cruz Meleiro

    2005-06-01

    Full Text Available In this work a MIMO non-linear predictive controller was developed for an extractive alcoholic fermentation process. The internal model of the controller was represented by two MISO Functional Link Networks (FLNs, identified using simulated data generated from a deterministic mathematical model whose kinetic parameters were determined experimentally. The FLN structure presents as advantages fast training and guaranteed convergence, since the estimation of the weights is a linear optimization problem. Besides, the elimination of non-significant weights generates parsimonious models, which allows for fast execution in an MPC-based algorithm. The proposed algorithm showed good potential in identification and control of non-linear processes.Neste trabalho um controlador preditivo não linear multivariável foi desenvolvido para um processo de fermentação alcoólica extrativa. O modelo interno do controlador foi representado por duas redes do tipo Functional Link (FLN, identificadas usando dados de simulação gerados a partir de um modelo validado experimentalmente. A estrutura FLN apresenta como vantagem o treinamento rápido e convergência garantida, já que a estimação dos seus pesos é um problema de otimização linear. Além disso, a eliminação de pesos não significativos gera modelos parsimoniosos, o que permite a rápida execução em algoritmos de controle preditivo baseado em modelo. Os resultados mostram que o algoritmo proposto tem grande potencial para identificação e controle de processos não lineares.

  11. A prediction algorithm for first onset of major depression in the general population: development and validation.

    Science.gov (United States)

    Wang, JianLi; Sareen, Jitender; Patten, Scott; Bolton, James; Schmitz, Norbert; Birney, Arden

    2014-05-01

    Prediction algorithms are useful for making clinical decisions and for population health planning. However, such prediction algorithms for first onset of major depression do not exist. The objective of this study was to develop and validate a prediction algorithm for first onset of major depression in the general population. Longitudinal study design with approximate 3-year follow-up. The study was based on data from a nationally representative sample of the US general population. A total of 28 059 individuals who participated in Waves 1 and 2 of the US National Epidemiologic Survey on Alcohol and Related Conditions and who had not had major depression at Wave 1 were included. The prediction algorithm was developed using logistic regression modelling in 21 813 participants from three census regions. The algorithm was validated in participants from the 4th census region (n=6246). Major depression occurred since Wave 1 of the National Epidemiologic Survey on Alcohol and Related Conditions, assessed by the Alcohol Use Disorder and Associated Disabilities Interview Schedule-diagnostic and statistical manual for mental disorders IV. A prediction algorithm containing 17 unique risk factors was developed. The algorithm had good discriminative power (C statistics=0.7538, 95% CI 0.7378 to 0.7699) and excellent calibration (F-adjusted test=1.00, p=0.448) with the weighted data. In the validation sample, the algorithm had a C statistic of 0.7259 and excellent calibration (Hosmer-Lemeshow χ(2)=3.41, p=0.906). The developed prediction algorithm has good discrimination and calibration capacity. It can be used by clinicians, mental health policy-makers and service planners and the general public to predict future risk of having major depression. The application of the algorithm may lead to increased personalisation of treatment, better clinical decisions and more optimal mental health service planning.

  12. An Improved User Selection Algorithm in Multiuser MIMO Broadcast with Channel Prediction

    Science.gov (United States)

    Min, Zhi; Ohtsuki, Tomoaki

    In multiuser MIMO-BC (Multiple-Input Multiple-Output Broadcasting) systems, user selection is important to achieve multiuser diversity. The optimal user selection algorithm is to try all the combinations of users to find the user group that can achieve the multiuser diversity. Unfortunately, the high calculation cost of the optimal algorithm prevents its implementation. Thus, instead of the optimal algorithm, some suboptimal user selection algorithms were proposed based on semiorthogonality of user channel vectors. The purpose of this paper is to achieve multiuser diversity with a small amount of calculation. For this purpose, we propose a user selection algorithm that can improve the orthogonality of a selected user group. We also apply a channel prediction technique to a MIMO-BC system to get more accurate channel information at the transmitter. Simulation results show that the channel prediction can improve the accuracy of channel information for user selections, and the proposed user selection algorithm achieves higher sum rate capacity than the SUS (Semiorthogonal User Selection) algorithm. Also we discuss the setting of the algorithm threshold. As the result of a discussion on the calculation complexity, which uses the number of complex multiplications as the parameter, the proposed algorithm is shown to have a calculation complexity almost equal to that of the SUS algorithm, and they are much lower than that of the optimal user selection algorithm.

  13. Effectiveness of link prediction for face-to-face behavioral networks.

    Science.gov (United States)

    Tsugawa, Sho; Ohsaki, Hiroyuki

    2013-01-01

    Research on link prediction for social networks has been actively pursued. In link prediction for a given social network obtained from time-windowed observation, new link formation in the network is predicted from the topology of the obtained network. In contrast, recent advances in sensing technology have made it possible to obtain face-to-face behavioral networks, which are social networks representing face-to-face interactions among people. However, the effectiveness of link prediction techniques for face-to-face behavioral networks has not yet been explored in depth. To clarify this point, here we investigate the accuracy of conventional link prediction techniques for networks obtained from the history of face-to-face interactions among participants at an academic conference. Our findings were (1) that conventional link prediction techniques predict new link formation with a precision of 0.30-0.45 and a recall of 0.10-0.20, (2) that prolonged observation of social networks often degrades the prediction accuracy, (3) that the proposed decaying weight method leads to higher prediction accuracy than can be achieved by observing all records of communication and simply using them unmodified, and (4) that the prediction accuracy for face-to-face behavioral networks is relatively high compared to that for non-social networks, but not as high as for other types of social networks.

  14. Constrained Active Learning for Anchor Link Prediction Across Multiple Heterogeneous Social Networks.

    Science.gov (United States)

    Zhu, Junxing; Zhang, Jiawei; Wu, Quanyuan; Jia, Yan; Zhou, Bin; Wei, Xiaokai; Yu, Philip S

    2017-08-03

    Nowadays, people are usually involved in multiple heterogeneous social networks simultaneously. Discovering the anchor links between the accounts owned by the same users across different social networks is crucial for many important inter-network applications, e.g., cross-network link transfer and cross-network recommendation. Many different supervised models have been proposed to predict anchor links so far, but they are effective only when the labeled anchor links are abundant. However, in real scenarios, such a requirement can hardly be met and most anchor links are unlabeled, since manually labeling the inter-network anchor links is quite costly and tedious. To overcome such a problem and utilize the numerous unlabeled anchor links in model building, in this paper, we introduce the active learning based anchor link prediction problem. Different from the traditional active learning problems, due to the one-to-one constraint on anchor links, if an unlabeled anchor link a = ( u , v ) is identified as positive (i.e., existing), all the other unlabeled anchor links incident to account u or account v will be negative (i.e., non-existing) automatically. Viewed in such a perspective, asking for the labels of potential positive anchor links in the unlabeled set will be rewarding in the active anchor link prediction problem. Various novel anchor link information gain measures are defined in this paper, based on which several constraint active anchor link prediction methods are introduced. Extensive experiments have been done on real-world social network datasets to compare the performance of these methods with state-of-art anchor link prediction methods. The experimental results show that the proposed Mean-entropy-based Constrained Active Learning (MC) method can outperform other methods with significant advantages.

  15. To trade or not to trade: Link prediction in the virtual water network

    Science.gov (United States)

    Tuninetti, Marta; Tamea, Stefania; Laio, Francesco; Ridolfi, Luca

    2017-12-01

    In the international trade network, links express the (temporary) presence of a commercial exchange of goods between any two countries. Given the dynamical behaviour of the trade network, where links are created and dismissed every year, predicting the link activation/deactivation is an open research question. Through the international trade network of agricultural goods, water resources are 'virtually' transferred from the country of production to the country of consumption. We propose a novel methodology for link prediction applied to the network of virtual water trade. Starting from the assumption of having links between any two countries, we estimate the associated virtual water flows by means of a gravity-law model using country and link characteristics as drivers. We consider the links with estimated flows higher than 1000 m3/year as active links, while the others as non-active links. Flows traded along estimated active links are then re-estimated using a similar but differently-calibrated gravity-law model. We were able to correctly model 84% of the existing links and 93% of the non-existing links in year 2011. It is worth to note that the predicted active links carry 99% of the global virtual water flow; hence, missed links are mainly those where a minimum volume of virtual water is exchanged. Results indicate that, over the period from 1986 to 2011, population, geographical distances between countries, and agricultural efficiency (through fertilizers use) are the major factors driving the link activation and deactivation. As opposed to other (network-based) models for link prediction, the proposed method is able to reconstruct the network architecture without any prior knowledge of the network topology, using only the nodes and links attributes; it thus represents a general method that can be applied to other networks such as food or value trade networks.

  16. Verification and improvement of predictive algorithms for radionuclide migration

    International Nuclear Information System (INIS)

    Carnahan, C.L.; Miller, C.W.; Remer, J.S.

    1984-01-01

    This research investigated the adequacy of current numerical codes in simulating geochemical interactions affecting radionuclide migration, the level of complexity required in chemical algorithms of transport models, and the validity of the constant-k/sub D/ concept in chemical transport modeling. An initial survey of the literature led to the conclusion that existing numerical codes did not encompass the full range of chemical and physical phenomena influential in radionuclide migration

  17. Prediction and Analysis of students Behavior using BARC Algorithm

    OpenAIRE

    M.Sindhuja; Dr.S.Rajalakshmi; S.M.Nandagopal

    2013-01-01

    Educational Data mining is a recent trends where data mining methods are experimented for the improvement of student performance in academics. The work describes the mining of higher education students’ related attributes such as behavior, attitude and relationship. The data were collected from a higher education institution in terms of the mentioned attributes. The proposed work explored Behavior Attitude Relationship Clustering (BARC) Algorithm, which showed the improvement in students’ per...

  18. Novel prediction- and subblock-based algorithm for fractal image compression

    International Nuclear Information System (INIS)

    Chung, K.-L.; Hsu, C.-H.

    2006-01-01

    Fractal encoding is the most consuming part in fractal image compression. In this paper, a novel two-phase prediction- and subblock-based fractal encoding algorithm is presented. Initially the original gray image is partitioned into a set of variable-size blocks according to the S-tree- and interpolation-based decomposition principle. In the first phase, each current block of variable-size range block tries to find the best matched domain block based on the proposed prediction-based search strategy which utilizes the relevant neighboring variable-size domain blocks. The first phase leads to a significant computation-saving effect. If the domain block found within the predicted search space is unacceptable, in the second phase, a subblock strategy is employed to partition the current variable-size range block into smaller blocks to improve the image quality. Experimental results show that our proposed prediction- and subblock-based fractal encoding algorithm outperforms the conventional full search algorithm and the recently published spatial-correlation-based algorithm by Truong et al. in terms of encoding time and image quality. In addition, the performance comparison among our proposed algorithm and the other two algorithms, the no search-based algorithm and the quadtree-based algorithm, are also investigated

  19. A Decomposition Algorithm for Mean-Variance Economic Model Predictive Control of Stochastic Linear Systems

    DEFF Research Database (Denmark)

    Sokoler, Leo Emil; Dammann, Bernd; Madsen, Henrik

    2014-01-01

    This paper presents a decomposition algorithm for solving the optimal control problem (OCP) that arises in Mean-Variance Economic Model Predictive Control of stochastic linear systems. The algorithm applies the alternating direction method of multipliers to a reformulation of the OCP...

  20. Gas Emission Prediction Model of Coal Mine Based on CSBP Algorithm

    Directory of Open Access Journals (Sweden)

    Xiong Yan

    2016-01-01

    Full Text Available In view of the nonlinear characteristics of gas emission in a coal working face, a prediction method is proposed based on cuckoo search algorithm optimized BP neural network (CSBP. In the CSBP algorithm, the cuckoo search is adopted to optimize weight and threshold parameters of BP network, and obtains the global optimal solutions. Furthermore, the twelve main affecting factors of the gas emission in the coal working face are taken as input vectors of CSBP algorithm, the gas emission is acted as output vector, and then the prediction model of BP neural network with optimal parameters is established. The results show that the CSBP algorithm has batter generalization ability and higher prediction accuracy, and can be utilized effectively in the prediction of coal mine gas emission.

  1. Training the Recurrent neural network by the Fuzzy Min-Max algorithm for fault prediction

    International Nuclear Information System (INIS)

    Zemouri, Ryad; Racoceanu, Daniel; Zerhouni, Noureddine; Minca, Eugenia; Filip, Florin

    2009-01-01

    In this paper, we present a training technique of a Recurrent Radial Basis Function neural network for fault prediction. We use the Fuzzy Min-Max technique to initialize the k-center of the RRBF neural network. The k-means algorithm is then applied to calculate the centers that minimize the mean square error of the prediction task. The performances of the k-means algorithm are then boosted by the Fuzzy Min-Max technique.

  2. An adaptive prediction and detection algorithm for multistream syndromic surveillance

    Directory of Open Access Journals (Sweden)

    Magruder Steve F

    2005-10-01

    Full Text Available Abstract Background Surveillance of Over-the-Counter pharmaceutical (OTC sales as a potential early indicator of developing public health conditions, in particular in cases of interest to biosurvellance, has been suggested in the literature. This paper is a continuation of a previous study in which we formulated the problem of estimating clinical data from OTC sales in terms of optimal LMS linear and Finite Impulse Response (FIR filters. In this paper we extend our results to predict clinical data multiple steps ahead using OTC sales as well as the clinical data itself. Methods The OTC data are grouped into a few categories and we predict the clinical data using a multichannel filter that encompasses all the past OTC categories as well as the past clinical data itself. The prediction is performed using FIR (Finite Impulse Response filters and the recursive least squares method in order to adapt rapidly to nonstationary behaviour. In addition, we inject simulated events in both clinical and OTC data streams to evaluate the predictions by computing the Receiver Operating Characteristic curves of a threshold detector based on predicted outputs. Results We present all prediction results showing the effectiveness of the combined filtering operation. In addition, we compute and present the performance of a detector using the prediction output. Conclusion Multichannel adaptive FIR least squares filtering provides a viable method of predicting public health conditions, as represented by clinical data, from OTC sales, and/or the clinical data. The potential value to a biosurveillance system cannot, however, be determined without studying this approach in the presence of transient events (nonstationary events of relatively short duration and fast rise times. Our simulated events superimposed on actual OTC and clinical data allow us to provide an upper bound on that potential value under some restricted conditions. Based on our ROC curves we argue that a

  3. Geometric Semantic Genetic Programming Algorithm and Slump Prediction

    OpenAIRE

    Xu, Juncai; Shen, Zhenzhong; Ren, Qingwen; Xie, Xin; Yang, Zhengyu

    2017-01-01

    Research on the performance of recycled concrete as building material in the current world is an important subject. Given the complex composition of recycled concrete, conventional methods for forecasting slump scarcely obtain satisfactory results. Based on theory of nonlinear prediction method, we propose a recycled concrete slump prediction model based on geometric semantic genetic programming (GSGP) and combined it with recycled concrete features. Tests show that the model can accurately p...

  4. Prediction of Employee Turnover in Organizations using Machine Learning Algorithms

    OpenAIRE

    Rohit Punnoose; Pankaj Ajit

    2016-01-01

    Employee turnover has been identified as a key issue for organizations because of its adverse impact on work place productivity and long term growth strategies. To solve this problem, organizations use machine learning techniques to predict employee turnover. Accurate predictions enable organizations to take action for retention or succession planning of employees. However, the data for this modeling problem comes from HR Information Systems (HRIS); these are typically under-funded compared t...

  5. A Multiple Model Prediction Algorithm for CNC Machine Wear PHM

    Directory of Open Access Journals (Sweden)

    Huimin Chen

    2011-01-01

    Full Text Available The 2010 PHM data challenge focuses on the remaining useful life (RUL estimation for cutters of a high speed CNC milling machine using measurements from dynamometer, accelerometer, and acoustic emission sensors. We present a multiple model approach for wear depth estimation of milling machine cutters using the provided data. The feature selection, initial wear estimation and multiple model fusion components of the proposed algorithm are explained in details and compared with several alternative methods using the training data. The final submission ranked #2 among professional and student participants and the method is applicable to other data driven PHM problems.

  6. Appropriate Combination of Artificial Intelligence and Algorithms for Increasing Predictive Accuracy Management

    Directory of Open Access Journals (Sweden)

    Shahram Gilani Nia

    2010-03-01

    Full Text Available In this paper a simple and effective expert system to predict random data fluctuation in short-term period is established. Evaluation process includes introducing Fourier series, Markov chain model prediction and comparison (Gray combined with the model prediction Gray- Fourier- Markov that the mixed results, to create an expert system predicted with artificial intelligence, made this model to predict the effectiveness of random fluctuation in most data management programs to increase. The outcome of this study introduced artificial intelligence algorithms that help detect that the computer environment to create a system that experts predict the short-term and unstable situation happens correctly and accurately predict. To test the effectiveness of the algorithm presented studies (Chen Tzay len,2008, and predicted data of tourism demand for Iran model is used. Results for the two countries show output model has high accuracy.

  7. Model Predictive Control Algorithms for Pen and Pump Insulin Administration

    DEFF Research Database (Denmark)

    Boiroux, Dimitri

    at mealtime, and the case where the insulin sensitivity increases during the night. This thesis consists of a summary report, glucose and insulin proles of the clinical studies and research papers submitted, peer-reviewed and/or published in the period September 2009 - September 2012....... of current closed-loop controllers. In this thesis, we present different control strategies based on Model Predictive Control (MPC) for an artificial pancreas. We use Nonlinear Model Predictive Control (NMPC) in order to determine the optimal insulin and blood glucose profiles. The optimal control problem...

  8. Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life

    International Nuclear Information System (INIS)

    Hu Chao; Youn, Byeng D.; Wang Pingfeng; Taek Yoon, Joung

    2012-01-01

    Prognostics aims at determining whether a failure of an engineered system (e.g., a nuclear power plant) is impending and estimating the remaining useful life (RUL) before the failure occurs. The traditional data-driven prognostic approach is to construct multiple candidate algorithms using a training data set, evaluate their respective performance using a testing data set, and select the one with the best performance while discarding all the others. This approach has three shortcomings: (i) the selected standalone algorithm may not be robust; (ii) it wastes the resources for constructing the algorithms that are discarded; (iii) it requires the testing data in addition to the training data. To overcome these drawbacks, this paper proposes an ensemble data-driven prognostic approach which combines multiple member algorithms with a weighted-sum formulation. Three weighting schemes, namely the accuracy-based weighting, diversity-based weighting and optimization-based weighting, are proposed to determine the weights of member algorithms. The k-fold cross validation (CV) is employed to estimate the prediction error required by the weighting schemes. The results obtained from three case studies suggest that the ensemble approach with any weighting scheme gives more accurate RUL predictions compared to any sole algorithm when member algorithms producing diverse RUL predictions have comparable prediction accuracy and that the optimization-based weighting scheme gives the best overall performance among the three weighting schemes.

  9. Application of Functional Link Artificial Neural Network for Prediction of Machinery Noise in Opencast Mines

    Directory of Open Access Journals (Sweden)

    Santosh Kumar Nanda

    2011-01-01

    Full Text Available Functional link-based neural network models were applied to predict opencast mining machineries noise. The paper analyzes the prediction capabilities of functional link neural network based noise prediction models vis-à-vis existing statistical models. In order to find the actual noise status in opencast mines, some of the popular noise prediction models, for example, ISO-9613-2, CONCAWE, VDI, and ENM, have been applied in mining and allied industries to predict the machineries noise by considering various attenuation factors. Functional link artificial neural network (FLANN, polynomial perceptron network (PPN, and Legendre neural network (LeNN were used to predict the machinery noise in opencast mines. The case study is based on data collected from an opencast coal mine of Orissa, India. From the present investigations, it could be concluded that the FLANN model give better noise prediction than the PPN and LeNN model.

  10. Centrality Robustness and Link Prediction in Complex Social Networks

    DEFF Research Database (Denmark)

    Davidsen, Søren Atmakuri; Ortiz-Arroyo, Daniel

    2012-01-01

    . Secondly, we present a method to predict edges in dynamic social networks. Our experimental results indicate that the robustness of the centrality measures applied to more realistic social networks follows a predictable pattern and that the use of temporal statistics could improve the accuracy achieved......This chapter addresses two important issues in social network analysis that involve uncertainty. Firstly, we present am analysis on the robustness of centrality measures that extend the work presented in Borgati et al. using three types of complex network structures and one real social network...

  11. An application of earthquake prediction algorithm M8 in eastern ...

    Indian Academy of Sciences (India)

    2Institute of Earthquake Prediction Theory and Mathematical Geophysics, ... located about 70 km from a preceding M7.3 earthquake that occurred in ... local extremes of the seismic density distribution, and in the third approach, CI centers were distributed ...... Bird P 2003 An updated digital model of plate boundaries;.

  12. A unified algorithm for predicting partition coefficients for PBPK modeling of drugs and environmental chemicals

    International Nuclear Information System (INIS)

    Peyret, Thomas; Poulin, Patrick; Krishnan, Kannan

    2010-01-01

    The algorithms in the literature focusing to predict tissue:blood PC (P tb ) for environmental chemicals and tissue:plasma PC based on total (K p ) or unbound concentration (K pu ) for drugs differ in their consideration of binding to hemoglobin, plasma proteins and charged phospholipids. The objective of the present study was to develop a unified algorithm such that P tb , K p and K pu for both drugs and environmental chemicals could be predicted. The development of the unified algorithm was accomplished by integrating all mechanistic algorithms previously published to compute the PCs. Furthermore, the algorithm was structured in such a way as to facilitate predictions of the distribution of organic compounds at the macro (i.e. whole tissue) and micro (i.e. cells and fluids) levels. The resulting unified algorithm was applied to compute the rat P tb , K p or K pu of muscle (n = 174), liver (n = 139) and adipose tissue (n = 141) for acidic, neutral, zwitterionic and basic drugs as well as ketones, acetate esters, alcohols, aliphatic hydrocarbons, aromatic hydrocarbons and ethers. The unified algorithm reproduced adequately the values predicted previously by the published algorithms for a total of 142 drugs and chemicals. The sensitivity analysis demonstrated the relative importance of the various compound properties reflective of specific mechanistic determinants relevant to prediction of PC values of drugs and environmental chemicals. Overall, the present unified algorithm uniquely facilitates the computation of macro and micro level PCs for developing organ and cellular-level PBPK models for both chemicals and drugs.

  13. A comprehensive performance evaluation on the prediction results of existing cooperative transcription factors identification algorithms.

    Science.gov (United States)

    Lai, Fu-Jou; Chang, Hong-Tsun; Huang, Yueh-Min; Wu, Wei-Sheng

    2014-01-01

    Eukaryotic transcriptional regulation is known to be highly connected through the networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. The recent advances in computational techniques led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on different rationales, it possessed its own merit and claimed outperforming others. However, the claim was prone to subjectivity because each algorithm compared with only a few other algorithms and only used a small set of performance indices for comparison. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms. And based on the proposed performance indices, we conducted a comprehensive performance evaluation. We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured and a ranking score according to the mean cooperativity of the set was given to each set of PCTFPs under evaluation for each performance index. It was seen that the ranking scores of a set of PCTFPs vary with different performance indices, implying that an algorithm used in predicting cooperative TF pairs is of strength somewhere but may be of weakness elsewhere. We finally made a comprehensive ranking for these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation on the prediction results of 14 existing cooperative TFs identification algorithms. Most importantly, these proposed indices can be easily applied to measure the performance of new

  14. Algorithms

    Indian Academy of Sciences (India)

    to as 'divide-and-conquer'. Although there has been a large effort in realizing efficient algorithms, there are not many universally accepted algorithm design paradigms. In this article, we illustrate algorithm design techniques such as balancing, greedy strategy, dynamic programming strategy, and backtracking or traversal of ...

  15. Prediction of GPCR-Ligand Binding Using Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Sangmin Seo

    2018-01-01

    Full Text Available We propose a novel method that predicts binding of G-protein coupled receptors (GPCRs and ligands. The proposed method uses hub and cycle structures of ligands and amino acid motif sequences of GPCRs, rather than the 3D structure of a receptor or similarity of receptors or ligands. The experimental results show that these new features can be effective in predicting GPCR-ligand binding (average area under the curve [AUC] of 0.944, because they are thought to include hidden properties of good ligand-receptor binding. Using the proposed method, we were able to identify novel ligand-GPCR bindings, some of which are supported by several studies.

  16. Adaboost Ensemble with Simple Genetic Algorithm for Student Prediction Mode

    OpenAIRE

    AhmedSharaf ElDen; ElDen1Malaka A. Moustafa2Hany; M. Harb; AbdelH.Emara

    2013-01-01

    Predicting the student performance is a great concern to the higher education managements.Thisprediction helps to identify and to improve students' performance.Several factors may improve thisperformance.In the present study, we employ the data mining processes, particularly classification, toenhance the quality of the higher educational system. Recently, a new direction is used for the improvementof the classification accuracy by combining classifiers.In thispaper, we design and evaluate a f...

  17. A Novel Grey Prediction Model Combining Markov Chain with Functional-Link Net and Its Application to Foreign Tourist Forecasting

    Directory of Open Access Journals (Sweden)

    Yi-Chung Hu

    2017-10-01

    Full Text Available Grey prediction models for time series have been widely applied to demand forecasting because only limited data are required for them to build a time series model without any statistical assumptions. Previous studies have demonstrated that the combination of grey prediction with neural networks helps grey prediction perform better. Some methods have been presented to improve the prediction accuracy of the popular GM(1,1 model by using the Markov chain to estimate the residual needed to modify a predicted value. Compared to the previous Grey-Markov models, this study contributes to apply the functional-link net to estimate the degree to which a predicted value obtained from the GM(1,1 model can be adjusted. Furthermore, the troublesome number of states and their bounds that are not easily specified in Markov chain have been determined by a genetic algorithm. To verify prediction performance, the proposed grey prediction model was applied to an important grey system problem—foreign tourist forecasting. Experimental results show that the proposed model provides satisfactory results compared to the other Grey-Markov models considered.

  18. Reliability-Based Design Optimization of Trusses with Linked-Discrete Design Variables using the Improved Firefly Algorithm

    Directory of Open Access Journals (Sweden)

    N. M. Okasha

    2016-04-01

    Full Text Available In this paper, an approach for conducting a Reliability-Based Design Optimization (RBDO of truss structures with linked-discrete design variables is proposed. The sections of the truss members are selected from the AISC standard tables and thus the design variables that represent the properties of each section are linked. Latin hypercube sampling is used in the evaluation of the structural reliability. The improved firefly algorithm is used for the optimization solution process. It was found that in order to use the improved firefly algorithm for efficiently solving problems of reliability-based design optimization with linked-discrete design variables; it needs to be modified as proposed in this paper to accelerate its convergence.

  19. An outlook on robust model predictive control algorithms : Reflections on performance and computational aspects

    NARCIS (Netherlands)

    Saltik, M.B.; Özkan, L.; Ludlage, J.H.A.; Weiland, S.; Van den Hof, P.M.J.

    2018-01-01

    In this paper, we discuss the model predictive control algorithms that are tailored for uncertain systems. Robustness notions with respect to both deterministic (or set based) and stochastic uncertainties are discussed and contributions are reviewed in the model predictive control literature. We

  20. A new algorithm predicts pressure and temperature profiles of gas/gas-condensate transmission pipelines

    Energy Technology Data Exchange (ETDEWEB)

    Mokhatab, Saied [OIEC - Oil Industries' Engineering and Construction Group, Tehran (Iran, Islamic Republic of); Vatani, Ali [University of Tehran (Iran, Islamic Republic of)

    2003-07-01

    The main objective of the present study has been the development of a relatively simple analytical algorithm for predicting flow temperature and pressure profiles along the two-phase, gas/gas-condensate transmission pipelines. Results demonstrate the ability of the method to predict reasonably accurate pressure gradient and temperature gradient profiles under operating conditions. (author)

  1. A rank-based Prediction Algorithm of Learning User's Intention

    Science.gov (United States)

    Shen, Jie; Gao, Ying; Chen, Cang; Gong, HaiPing

    Internet search has become an important part in people's daily life. People can find many types of information to meet different needs through search engines on the Internet. There are two issues for the current search engines: first, the users should predetermine the types of information they want and then change to the appropriate types of search engine interfaces. Second, most search engines can support multiple kinds of search functions, each function has its own separate search interface. While users need different types of information, they must switch between different interfaces. In practice, most queries are corresponding to various types of information results. These queries can search the relevant results in various search engines, such as query "Palace" contains the websites about the introduction of the National Palace Museum, blog, Wikipedia, some pictures and video information. This paper presents a new aggregative algorithm for all kinds of search results. It can filter and sort the search results by learning three aspects about the query words, search results and search history logs to achieve the purpose of detecting user's intention. Experiments demonstrate that this rank-based method for multi-types of search results is effective. It can meet the user's search needs well, enhance user's satisfaction, provide an effective and rational model for optimizing search engines and improve user's search experience.

  2. A First-order Prediction-Correction Algorithm for Time-varying (Constrained) Optimization: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Dall-Anese, Emiliano [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Simonetto, Andrea [Universite catholique de Louvain

    2017-07-25

    This paper focuses on the design of online algorithms based on prediction-correction steps to track the optimal solution of a time-varying constrained problem. Existing prediction-correction methods have been shown to work well for unconstrained convex problems and for settings where obtaining the inverse of the Hessian of the cost function can be computationally affordable. The prediction-correction algorithm proposed in this paper addresses the limitations of existing methods by tackling constrained problems and by designing a first-order prediction step that relies on the Hessian of the cost function (and do not require the computation of its inverse). Analytical results are established to quantify the tracking error. Numerical simulations corroborate the analytical results and showcase performance and benefits of the algorithms.

  3. Application of Machine Learning Algorithms for the Query Performance Prediction

    Directory of Open Access Journals (Sweden)

    MILICEVIC, M.

    2015-08-01

    Full Text Available This paper analyzes the relationship between the system load/throughput and the query response time in a real Online transaction processing (OLTP system environment. Although OLTP systems are characterized by short transactions, which normally entail high availability and consistent short response times, the need for operational reporting may jeopardize these objectives. We suggest a new approach to performance prediction for concurrent database workloads, based on the system state vector which consists of 36 attributes. There is no bias to the importance of certain attributes, but the machine learning methods are used to determine which attributes better describe the behavior of the particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, which are selected to represent frequent business processes. The possibility of the accurate response time prediction may be a foundation for automated decision-making for database (DB query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS management or real-time dynamic query scheduling (e.g. estimation of the optimal moment for a complex query execution.

  4. Exploring the significance of human mobility patterns in social link prediction

    KAUST Repository

    Alharbi, Basma Mohammed

    2014-01-01

    Link prediction is a fundamental task in social networks. Recently, emphasis has been placed on forecasting new social ties using user mobility patterns, e.g., investigating physical and semantic co-locations for new proximity measure. This paper explores the effect of in-depth mobility patterns. Specifically, we study individuals\\' movement behavior, and quantify mobility on the basis of trip frequency, travel purpose and transportation mode. Our hybrid link prediction model is composed of two modules. The first module extracts mobility patterns, including travel purpose and mode, from raw trajectory data. The second module employs the extracted patterns for link prediction. We evaluate our method on two real data sets, GeoLife [15] and Reality Mining [5]. Experimental results show that our hybrid model significantly improves the accuracy of social link prediction, when comparing to primary topology-based solutions. Copyright 2014 ACM.

  5. Demonstration of Linked UAV Observations and Atmospheric Model Predictions in Chem/Bio Attack Response

    National Research Council Canada - National Science Library

    Davidson, Kenneth

    2003-01-01

    ... meteorological data, and the means for linking the UAV data to real-time dispersion prediction. The primary modeling effort focused on an adaptation of the 'Wind On Constant Streamline Surfaces...

  6. Exploring the significance of human mobility patterns in social link prediction

    KAUST Repository

    Alharbi, Basma Mohammed; Zhang, Xiangliang

    2014-01-01

    Link prediction is a fundamental task in social networks. Recently, emphasis has been placed on forecasting new social ties using user mobility patterns, e.g., investigating physical and semantic co-locations for new proximity measure. This paper

  7. Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches.

    Science.gov (United States)

    Crichton, Gamal; Guo, Yufan; Pyysalo, Sampo; Korhonen, Anna

    2018-05-21

    Link prediction in biomedical graphs has several important applications including predicting Drug-Target Interactions (DTI), Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary works were done on this and report promising results. However they did not use realistic settings like time-slicing, evaluate performances with comprehensive metrics or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼ 6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to those of established baselines and report performance across five metrics. In random- and time-sliced experiments when the neural network methods were able to learn good node representations and there was a negligible amount of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼ 15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼ 0.3) the approaches were mostly equal, but at higher recall levels across all nodes and average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours; potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them. Our results indicate

  8. Predicting patchy particle crystals: variable box shape simulations and evolutionary algorithms.

    Science.gov (United States)

    Bianchi, Emanuela; Doppelbauer, Günther; Filion, Laura; Dijkstra, Marjolein; Kahl, Gerhard

    2012-06-07

    We consider several patchy particle models that have been proposed in literature and we investigate their candidate crystal structures in a systematic way. We compare two different algorithms for predicting crystal structures: (i) an approach based on Monte Carlo simulations in the isobaric-isothermal ensemble and (ii) an optimization technique based on ideas of evolutionary algorithms. We show that the two methods are equally successful and provide consistent results on crystalline phases of patchy particle systems.

  9. Adaptive algorithm for predicting increases in central loads of electrical energy systems

    Energy Technology Data Exchange (ETDEWEB)

    Arbachyauskene, N A; Pushinaytis, K V

    1982-01-01

    An adaptive algorithm for predicting increases in central loads of the electrical energy system is suggested for the task of evaluating the condition. The algorithm is based on the Kalman filter. In order to calculate the coefficient of intensification, the a priori assigned noise characteristics with low accuracy are used only in the beginning of the calculation. Further, the coefficient of intensification is calculated from the innovation sequence. This approach makes it possible to correct errors in the assignment of the statistical noise characteristics and to follow their changes. The algorithm is experimentally verified.

  10. A time series based sequence prediction algorithm to detect activities of daily living in smart home.

    Science.gov (United States)

    Marufuzzaman, M; Reaz, M B I; Ali, M A M; Rahman, L F

    2015-01-01

    The goal of smart homes is to create an intelligent environment adapting the inhabitants need and assisting the person who needs special care and safety in their daily life. This can be reached by collecting the ADL (activities of daily living) data and further analysis within existing computing elements. In this research, a very recent algorithm named sequence prediction via enhanced episode discovery (SPEED) is modified and in order to improve accuracy time component is included. The modified SPEED or M-SPEED is a sequence prediction algorithm, which modified the previous SPEED algorithm by using time duration of appliance's ON-OFF states to decide the next state. M-SPEED discovered periodic episodes of inhabitant behavior, trained it with learned episodes, and made decisions based on the obtained knowledge. The results showed that M-SPEED achieves 96.8% prediction accuracy, which is better than other time prediction algorithms like PUBS, ALZ with temporal rules and the previous SPEED. Since human behavior shows natural temporal patterns, duration times can be used to predict future events more accurately. This inhabitant activity prediction system will certainly improve the smart homes by ensuring safety and better care for elderly and handicapped people.

  11. A Wavelet Analysis-Based Dynamic Prediction Algorithm to Network Traffic

    Directory of Open Access Journals (Sweden)

    Meng Fan-Bo

    2016-01-01

    Full Text Available Network traffic is a significantly important parameter for network traffic engineering, while it holds highly dynamic nature in the network. Accordingly, it is difficult and impossible to directly predict traffic amount of end-to-end flows. This paper proposes a new prediction algorithm to network traffic using the wavelet analysis. Firstly, network traffic is converted into the time-frequency domain to capture time-frequency feature of network traffic. Secondly, in different frequency components, we model network traffic in the time-frequency domain. Finally, we build the prediction model about network traffic. At the same time, the corresponding prediction algorithm is presented to attain network traffic prediction. Simulation results indicates that our approach is promising.

  12. Advertisement Click-Through Rate Prediction Based on the Weighted-ELM and Adaboost Algorithm

    Directory of Open Access Journals (Sweden)

    Sen Zhang

    2017-01-01

    Full Text Available Accurate click-through rate (CTR prediction can not only improve the advertisement company’s reputation and revenue, but also help the advertisers to optimize the advertising performance. There are two main unsolved problems of the CTR prediction: low prediction accuracy due to the imbalanced distribution of the advertising data and the lack of the real-time advertisement bidding implementation. In this paper, we will develop a novel online CTR prediction approach by incorporating the real-time bidding (RTB advertising by the following strategies: user profile system is constructed from the historical data of the RTB advertising to describe the user features, the historical CTR features, the ID features, and the other numerical features. A novel CTR prediction approach is presented to address the imbalanced learning sample distribution by integrating the Weighted-ELM (WELM and the Adaboost algorithm. Compared to the commonly used algorithms, the proposed approach can improve the CTR significantly.

  13. Testing earthquake prediction algorithms: Statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992-1997

    Science.gov (United States)

    Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.

    1999-01-01

    Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5 + , 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. ?? 1999 Elsevier

  14. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    Science.gov (United States)

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  15. Capacitance Estimation for DC-link Capacitors in a Back-to-Back Converter Based on Artificial Neural Network Algorithm

    DEFF Research Database (Denmark)

    Soliman, Hammam Abdelaal Hammam; Wang, Huai; Blaabjerg, Frede

    2016-01-01

    of the aforementioned challenges and shortcomings. In this paper, a pure software condition monitoring method based on Artificial Neural Network (ANN) algorithm is proposed. The implemented ANN estimates the capacitance of the dc-link capacitor in a back-to-back converter. The error analysis of the estimated results......The reliability of dc-link capacitors in power electronic converters is one of the critical aspects to be considered in modern power converter design. The observation of their ageing process and the estimation of their health status have been an attractive subject for the industrial field and hence...

  16. Higher-spin cluster algorithms: the Heisenberg spin and U(1) quantum link models

    Energy Technology Data Exchange (ETDEWEB)

    Chudnovsky, V

    2000-03-01

    I discuss here how the highly-efficient spin-1/2 cluster algorithm for the Heisenberg antiferromagnet may be extended to higher-dimensional representations; some numerical results are provided. The same extensions can be used for the U(1) flux cluster algorithm, but have not yielded signals of the desired Coulomb phase of the system.

  17. Higher-spin cluster algorithms: the Heisenberg spin and U(1) quantum link models

    International Nuclear Information System (INIS)

    Chudnovsky, V.

    2000-01-01

    I discuss here how the highly-efficient spin-1/2 cluster algorithm for the Heisenberg antiferromagnet may be extended to higher-dimensional representations; some numerical results are provided. The same extensions can be used for the U(1) flux cluster algorithm, but have not yielded signals of the desired Coulomb phase of the system

  18. Investigation of energy management strategies for photovoltaic systems - A predictive control algorithm

    Science.gov (United States)

    Cull, R. C.; Eltimsahy, A. H.

    1983-01-01

    The present investigation is concerned with the formulation of energy management strategies for stand-alone photovoltaic (PV) systems, taking into account a basic control algorithm for a possible predictive, (and adaptive) controller. The control system controls the flow of energy in the system according to the amount of energy available, and predicts the appropriate control set-points based on the energy (insolation) available by using an appropriate system model. Aspects of adaptation to the conditions of the system are also considered. Attention is given to a statistical analysis technique, the analysis inputs, the analysis procedure, and details regarding the basic control algorithm.

  19. RNA secondary structure prediction with pseudoknots: Contribution of algorithm versus energy model.

    Science.gov (United States)

    Jabbari, Hosna; Wark, Ian; Montemagno, Carlo

    2018-01-01

    RNA is a biopolymer with various applications inside the cell and in biotechnology. Structure of an RNA molecule mainly determines its function and is essential to guide nanostructure design. Since experimental structure determination is time-consuming and expensive, accurate computational prediction of RNA structure is of great importance. Prediction of RNA secondary structure is relatively simpler than its tertiary structure and provides information about its tertiary structure, therefore, RNA secondary structure prediction has received attention in the past decades. Numerous methods with different folding approaches have been developed for RNA secondary structure prediction. While methods for prediction of RNA pseudoknot-free structure (structures with no crossing base pairs) have greatly improved in terms of their accuracy, methods for prediction of RNA pseudoknotted secondary structure (structures with crossing base pairs) still have room for improvement. A long-standing question for improving the prediction accuracy of RNA pseudoknotted secondary structure is whether to focus on the prediction algorithm or the underlying energy model, as there is a trade-off on computational cost of the prediction algorithm versus the generality of the method. The aim of this work is to argue when comparing different methods for RNA pseudoknotted structure prediction, the combination of algorithm and energy model should be considered and a method should not be considered superior or inferior to others if they do not use the same scoring model. We demonstrate that while the folding approach is important in structure prediction, it is not the only important factor in prediction accuracy of a given method as the underlying energy model is also as of great value. Therefore we encourage researchers to pay particular attention in comparing methods with different energy models.

  20. Accuracy of algorithms to predict accessory pathway location in children with Wolff-Parkinson-White syndrome.

    Science.gov (United States)

    Wren, Christopher; Vogel, Melanie; Lord, Stephen; Abrams, Dominic; Bourke, John; Rees, Philip; Rosenthal, Eric

    2012-02-01

    The aim of this study was to examine the accuracy in predicting pathway location in children with Wolff-Parkinson-White syndrome for each of seven published algorithms. ECGs from 100 consecutive children with Wolff-Parkinson-White syndrome undergoing electrophysiological study were analysed by six investigators using seven published algorithms, six of which had been developed in adult patients. Accuracy and concordance of predictions were adjusted for the number of pathway locations. Accessory pathways were left-sided in 49, septal in 20 and right-sided in 31 children. Overall accuracy of prediction was 30-49% for the exact location and 61-68% including adjacent locations. Concordance between investigators varied between 41% and 86%. No algorithm was better at predicting septal pathways (accuracy 5-35%, improving to 40-78% including adjacent locations), but one was significantly worse. Predictive accuracy was 24-53% for the exact location of right-sided pathways (50-71% including adjacent locations) and 32-55% for the exact location of left-sided pathways (58-73% including adjacent locations). All algorithms were less accurate in our hands than in other authors' own assessment. None performed well in identifying midseptal or right anteroseptal accessory pathway locations.

  1. Algorithm for predicting the evolution of series of dynamics of complex systems in solving information problems

    Science.gov (United States)

    Kasatkina, T. I.; Dushkin, A. V.; Pavlov, V. A.; Shatovkin, R. R.

    2018-03-01

    In the development of information, systems and programming to predict the series of dynamics, neural network methods have recently been applied. They are more flexible, in comparison with existing analogues and are capable of taking into account the nonlinearities of the series. In this paper, we propose a modified algorithm for predicting the series of dynamics, which includes a method for training neural networks, an approach to describing and presenting input data, based on the prediction by the multilayer perceptron method. To construct a neural network, the values of a series of dynamics at the extremum points and time values corresponding to them, formed based on the sliding window method, are used as input data. The proposed algorithm can act as an independent approach to predicting the series of dynamics, and be one of the parts of the forecasting system. The efficiency of predicting the evolution of the dynamics series for a short-term one-step and long-term multi-step forecast by the classical multilayer perceptron method and a modified algorithm using synthetic and real data is compared. The result of this modification was the minimization of the magnitude of the iterative error that arises from the previously predicted inputs to the inputs to the neural network, as well as the increase in the accuracy of the iterative prediction of the neural network.

  2. An Interference-Aware Traffic-Priority-Based Link Scheduling Algorithm for Interference Mitigation in Multiple Wireless Body Area Networks

    Directory of Open Access Journals (Sweden)

    Thien T. T. Le

    2016-12-01

    Full Text Available Currently, wireless body area networks (WBANs are effectively used for health monitoring services. However, in cases where WBANs are densely deployed, interference among WBANs can cause serious degradation of network performance and reliability. Inter-WBAN interference can be reduced by scheduling the communication links of interfering WBANs. In this paper, we propose an interference-aware traffic-priority-based link scheduling (ITLS algorithm to overcome inter-WBAN interference in densely deployed WBANs. First, we model a network with multiple WBANs as an interference graph where node-level interference and traffic priority are taken into account. Second, we formulate link scheduling for multiple WBANs as an optimization model where the objective is to maximize the throughput of the entire network while ensuring the traffic priority of sensor nodes. Finally, we propose the ITLS algorithm for multiple WBANs on the basis of the optimization model. High spatial reuse is also achieved in the proposed ITLS algorithm. The proposed ITLS achieves high spatial reuse while considering traffic priority, packet length, and the number of interfered sensor nodes. Our simulation results show that the proposed ITLS significantly increases spatial reuse and network throughput with lower delay by mitigating inter-WBAN interference.

  3. LP-LPA: A link influence-based label propagation algorithm for discovering community structures in networks

    Science.gov (United States)

    Berahmand, Kamal; Bouyer, Asgarali

    2018-03-01

    Community detection is an essential approach for analyzing the structural and functional properties of complex networks. Although many community detection algorithms have been recently presented, most of them are weak and limited in different ways. Label Propagation Algorithm (LPA) is a well-known and efficient community detection technique which is characterized by the merits of nearly-linear running time and easy implementation. However, LPA has some significant problems such as instability, randomness, and monster community detection. In this paper, an algorithm, namely node’s label influence policy for label propagation algorithm (LP-LPA) was proposed for detecting efficient community structures. LP-LPA measures link strength value for edges and nodes’ label influence value for nodes in a new label propagation strategy with preference on link strength and for initial nodes selection, avoid of random behavior in tiebreak states, and efficient updating order and rule update. These procedures can sort out the randomness issue in an original LPA and stabilize the discovered communities in all runs of the same network. Experiments on synthetic networks and a wide range of real-world social networks indicated that the proposed method achieves significant accuracy and high stability. Indeed, it can obviously solve monster community problem with regard to detecting communities in networks.

  4. The Inverse Contagion Problem (ICP) vs.. Predicting site contagion in real time, when network links are not observable

    Science.gov (United States)

    Mushkin, I.; Solomon, S.

    2017-10-01

    We study the inverse contagion problem (ICP). As opposed to the direct contagion problem, in which the network structure is known and the question is when each node will be contaminated, in the inverse problem the links of the network are unknown but a sequence of contagion histories (the times when each node was contaminated) is observed. We consider two versions of the ICP: The strong problem (SICP), which is the reconstruction of the network and has been studied before, and the weak problem (WICP), which requires "only" the prediction (at each time step) of the nodes that will be contaminated at the next time step (this is often the real life situation in which a contagion is observed and predictions are made in real time). Moreover, our focus is on analyzing the increasing accuracy of the solution, as a function of the number of contagion histories already observed. For simplicity, we discuss the simplest (deterministic and synchronous) contagion dynamics and the simplest solution algorithm, which we have applied to different network types. The main result of this paper is that the complex problem of the convergence of the ICP for a network can be reduced to an individual property of pairs of nodes: the "false link difficulty". By definition, given a pair of unlinked nodes i and j, the difficulty of the false link (i,j) is the probability that in a random contagion history, the nodes i and j are not contaminated at the same time step (or at consecutive time steps). In other words, the "false link difficulty" of a non-existing network link is the probability that the observations during a random contagion history would not rule out that link. This probability is relatively straightforward to calculate, and in most instances relies only on the relative positions of the two nodes (i,j) and not on the entire network structure. We have observed the distribution of false link difficulty for various network types, estimated it theoretically and confronted it

  5. Beam-column joint shear prediction using hybridized deep learning neural network with genetic algorithm

    Science.gov (United States)

    Mundher Yaseen, Zaher; Abdulmohsin Afan, Haitham; Tran, Minh-Tung

    2018-04-01

    Scientifically evidenced that beam-column joints are a critical point in the reinforced concrete (RC) structure under the fluctuation loads effects. In this novel hybrid data-intelligence model developed to predict the joint shear behavior of exterior beam-column structure frame. The hybrid data-intelligence model is called genetic algorithm integrated with deep learning neural network model (GA-DLNN). The genetic algorithm is used as prior modelling phase for the input approximation whereas the DLNN predictive model is used for the prediction phase. To demonstrate this structural problem, experimental data is collected from the literature that defined the dimensional and specimens’ properties. The attained findings evidenced the efficitveness of the hybrid GA-DLNN in modelling beam-column joint shear problem. In addition, the accurate prediction achived with less input variables owing to the feasibility of the evolutionary phase.

  6. Algorithms

    Indian Academy of Sciences (India)

    ticians but also forms the foundation of computer science. Two ... with methods of developing algorithms for solving a variety of problems but ... applications of computers in science and engineer- ... numerical calculus are as important. We will ...

  7. Validation Study of a Predictive Algorithm to Evaluate Opioid Use Disorder in a Primary Care Setting

    Science.gov (United States)

    Sharma, Maneesh; Lee, Chee; Kantorovich, Svetlana; Tedtaotao, Maria; Smith, Gregory A.

    2017-01-01

    Background: Opioid abuse in chronic pain patients is a major public health issue. Primary care providers are frequently the first to prescribe opioids to patients suffering from pain, yet do not always have the time or resources to adequately evaluate the risk of opioid use disorder (OUD). Purpose: This study seeks to determine the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm (“profile”) incorporating phenotypic and, more uniquely, genotypic risk factors. Methods and Results: In a validation study with 452 participants diagnosed with OUD and 1237 controls, the algorithm successfully categorized patients at high and moderate risk of OUD with 91.8% sensitivity. Regardless of changes in the prevalence of OUD, sensitivity of the algorithm remained >90%. Conclusion: The algorithm correctly stratifies primary care patients into low-, moderate-, and high-risk categories to appropriately identify patients in need for additional guidance, monitoring, or treatment changes. PMID:28890908

  8. Dynamically Predicting the Quality of Service: Batch, Online, and Hybrid Algorithms

    Directory of Open Access Journals (Sweden)

    Ya Chen

    2017-01-01

    Full Text Available This paper studies the problem of dynamically modeling the quality of web service. The philosophy of designing practical web service recommender systems is delivered in this paper. A general system architecture for such systems continuously collects the user-service invocation records and includes both an online training module and an offline training module for quality prediction. In addition, we introduce matrix factorization-based online and offline training algorithms based on the gradient descent algorithms and demonstrate the fitness of this online/offline algorithm framework to the proposed architecture. The superiority of the proposed model is confirmed by empirical studies on a real-life quality of web service data set and comparisons with existing web service recommendation algorithms.

  9. Crius: A Novel Fragment-Based Algorithm of De Novo Substrate Prediction for Enzymes.

    Science.gov (United States)

    Yao, Zhiqiang; Jiang, Shuiqin; Zhang, Lujia; Gao, Bei; He, Xiao; Zhang, John Z H; Wei, Dongzhi

    2018-05-03

    The study of enzyme substrate specificity is vital for developing potential applications of enzymes. However, the routine experimental procedures require lot of resources in the discovery of novel substrates. This article reports an in silico structure-based algorithm called Crius, which predicts substrates for enzyme. The results of this fragment-based algorithm show good agreements between the simulated and experimental substrate specificities, using a lipase from Candida antarctica (CALB), a nitrilase from Cyanobacterium syechocystis sp. PCC6803 (Nit6803), and an aldo-keto reductase from Gluconobacter oxydans (Gox0644). This opens new prospects of developing computer algorithms that can effectively predict substrates for an enzyme. This article is protected by copyright. All rights reserved. © 2018 The Protein Society.

  10. Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

    Science.gov (United States)

    Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

    2017-05-25

    Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.

  11. Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

    Science.gov (United States)

    Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

    2015-01-01

    Background Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (pmachine learning algorithm (p=0.047). Conclusion Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273

  12. Predicting patchy particle crystals: variable box shape simulations and evolutionary algorithms

    NARCIS (Netherlands)

    Bianchi, E.; Doppelbauer, G.; Filion, L.C.; Dijkstra, M.; Kahl, G.

    2012-01-01

    We consider several patchy particle models that have been proposed in literature and we investigate their candidate crystal structures in a systematic way. We compare two different algorithms for predicting crystal structures: (i) an approach based on Monte Carlo simulations in the

  13. Analysis of longitudinal variations in North Pacific alkalinity to improve predictive algorithms

    Science.gov (United States)

    Fry, Claudia H.; Tyrrell, Toby; Achterberg, Eric P.

    2016-10-01

    The causes of natural variation in alkalinity in the North Pacific surface ocean need to be investigated to understand the carbon cycle and to improve predictive algorithms. We used GLODAPv2 to test hypotheses on the causes of three longitudinal phenomena in Alk*, a tracer of calcium carbonate cycling. These phenomena are (a) an increase from east to west between 45°N and 55°N, (b) an increase from west to east between 25°N and 40°N, and (c) a minor increase from west to east in the equatorial upwelling region. Between 45°N and 55°N, Alk* is higher on the western than on the eastern side, and this is associated with denser isopycnals with higher Alk* lying at shallower depths. Between 25°N and 40°N, upwelling along the North American continental shelf causes higher Alk* in the east. Along the equator, a strong east-west trend was not observed, even though the upwelling on the eastern side of the basin is more intense, because the water brought to the surface is not high in Alk*. We created two algorithms to predict alkalinity, one for the entire Pacific Ocean north of 30°S and one for the eastern margin. The Pacific Ocean algorithm is more accurate than the commonly used algorithm published by Lee et al. (2006), of similar accuracy to the best previously published algorithm by Sasse et al. (2013), and is less biased with longitude than other algorithms in the subpolar North Pacific. Our eastern margin algorithm is more accurate than previously published algorithms.

  14. Topology control algorithm for wireless sensor networks based on Link forwarding

    Science.gov (United States)

    Pucuo, Cairen; Qi, Ai-qin

    2018-03-01

    The research of topology control could effectively save energy and increase the service life of network based on wireless sensor. In this paper, a arithmetic called LTHC (link transmit hybrid clustering) based on link transmit is proposed. It decreases expenditure of energy by changing the way of cluster-node’s communication. The idea is to establish a link between cluster and SINK node when the cluster is formed, and link-node must be non-cluster. Through the link, cluster sends information to SINK nodes. For the sake of achieving the uniform distribution of energy on the network, prolongate the network survival time, and improve the purpose of communication, the communication will cut down much more expenditure of energy for cluster which away from SINK node. In the two aspects of improving the traffic and network survival time, we find that the LTCH is far superior to the traditional LEACH by experiments.

  15. A utility/cost analysis of breast cancer risk prediction algorithms

    Science.gov (United States)

    Abbey, Craig K.; Wu, Yirong; Burnside, Elizabeth S.; Wunderlich, Adam; Samuelson, Frank W.; Boone, John M.

    2016-03-01

    Breast cancer risk prediction algorithms are used to identify subpopulations that are at increased risk for developing breast cancer. They can be based on many different sources of data such as demographics, relatives with cancer, gene expression, and various phenotypic features such as breast density. Women who are identified as high risk may undergo a more extensive (and expensive) screening process that includes MRI or ultrasound imaging in addition to the standard full-field digital mammography (FFDM) exam. Given that there are many ways that risk prediction may be accomplished, it is of interest to evaluate them in terms of expected cost, which includes the costs of diagnostic outcomes. In this work we perform an expected-cost analysis of risk prediction algorithms that is based on a published model that includes the costs associated with diagnostic outcomes (true-positive, false-positive, etc.). We assume the existence of a standard screening method and an enhanced screening method with higher scan cost, higher sensitivity, and lower specificity. We then assess expected cost of using a risk prediction algorithm to determine who gets the enhanced screening method under the strong assumption that risk and diagnostic performance are independent. We find that if risk prediction leads to a high enough positive predictive value, it will be cost-effective regardless of the size of the subpopulation. Furthermore, in terms of the hit-rate and false-alarm rate of the of the risk prediction algorithm, iso-cost contours are lines with slope determined by properties of the available diagnostic systems for screening.

  16. Screw Remaining Life Prediction Based on Quantum Genetic Algorithm and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Xiaochen Zhang

    2017-01-01

    Full Text Available To predict the remaining life of ball screw, a screw remaining life prediction method based on quantum genetic algorithm (QGA and support vector machine (SVM is proposed. A screw accelerated test bench is introduced. Accelerometers are installed to monitor the performance degradation of ball screw. Combined with wavelet packet decomposition and isometric mapping (Isomap, the sensitive feature vectors are obtained and stored in database. Meanwhile, the sensitive feature vectors are randomly chosen from the database and constitute training samples and testing samples. Then the optimal kernel function parameter and penalty factor of SVM are searched with the method of QGA. Finally, the training samples are used to train optimized SVM while testing samples are adopted to test the prediction accuracy of the trained SVM so the screw remaining life prediction model can be got. The experiment results show that the screw remaining life prediction model could effectively predict screw remaining life.

  17. Comparison of the accuracy of three algorithms in predicting accessory pathways among adult Wolff-Parkinson-White syndrome patients.

    Science.gov (United States)

    Maden, Orhan; Balci, Kevser Gülcihan; Selcuk, Mehmet Timur; Balci, Mustafa Mücahit; Açar, Burak; Unal, Sefa; Kara, Meryem; Selcuk, Hatice

    2015-12-01

    The aim of this study was to investigate the accuracy of three algorithms in predicting accessory pathway locations in adult patients with Wolff-Parkinson-White syndrome in Turkish population. A total of 207 adult patients with Wolff-Parkinson-White syndrome were retrospectively analyzed. The most preexcited 12-lead electrocardiogram in sinus rhythm was used for analysis. Two investigators blinded to the patient data used three algorithms for prediction of accessory pathway location. Among all locations, 48.5% were left-sided, 44% were right-sided, and 7.5% were located in the midseptum or anteroseptum. When only exact locations were accepted as match, predictive accuracy for Chiang was 71.5%, 72.4% for d'Avila, and 71.5% for Arruda. The percentage of predictive accuracy of all algorithms did not differ between the algorithms (p = 1.000; p = 0.875; p = 0.885, respectively). The best algorithm for prediction of right-sided, left-sided, and anteroseptal and midseptal accessory pathways was Arruda (p algorithms were similar in predicting accessory pathway location and the predicted accuracy was lower than previously reported by their authors. However, according to the accessory pathway site, the algorithm designed by Arruda et al. showed better predictions than the other algorithms and using this algorithm may provide advantages before a planned ablation.

  18. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    Science.gov (United States)

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. ?? 2007 IEEE.

  19. Crystal structure prediction of flexible molecules using parallel genetic algorithms with a standard force field.

    Science.gov (United States)

    Kim, Seonah; Orendt, Anita M; Ferraro, Marta B; Facelli, Julio C

    2009-10-01

    This article describes the application of our distributed computing framework for crystal structure prediction (CSP) the modified genetic algorithms for crystal and cluster prediction (MGAC), to predict the crystal structure of flexible molecules using the general Amber force field (GAFF) and the CHARMM program. The MGAC distributed computing framework includes a series of tightly integrated computer programs for generating the molecule's force field, sampling crystal structures using a distributed parallel genetic algorithm and local energy minimization of the structures followed by the classifying, sorting, and archiving of the most relevant structures. Our results indicate that the method can consistently find the experimentally known crystal structures of flexible molecules, but the number of missing structures and poor ranking observed in some crystals show the need for further improvement of the potential. Copyright 2009 Wiley Periodicals, Inc.

  20. A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

    Science.gov (United States)

    Manning, Robert M.

    1990-01-01

    A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of the optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics data base is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0-5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.

  1. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions.

    Science.gov (United States)

    Hao, Ming; Bryant, Stephen H; Wang, Yanli

    2018-02-06

    While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet been improved prompting researchers looking for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, which have proved to be successful in drug discovery. Doubtlessly, the availability of open-accessible data from basic chemical biology research and the success of human genome sequencing are crucial to develop effective in silico drug repositioning methods allowing the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with source codes publicly accessible for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in R programming language, and compared these algorithms by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in the projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred. Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.

  2. Computational investigation of kinetics of cross-linking reactions in proteins: importance in structure prediction.

    Science.gov (United States)

    Bandyopadhyay, Pradipta; Kuntz, Irwin D

    2009-01-01

    The determination of protein structure using distance constraints is a new and promising field of study. One implementation involves attaching residues of a protein using a cross-linking agent, followed by protease digestion, analysis of the resulting peptides by mass spectroscopy, and finally sequence threading to detect the protein folds. In the present work, we carry out computational modeling of the kinetics of cross-linking reactions in proteins using the master equation approach. The rate constants of the cross-linking reactions are estimated using the pKas and the solvent-accessible surface areas of the residues involved. This model is tested with fibroblast growth factor (FGF) and cytochrome C. It is consistent with the initial experimental rate data for individual lysine residues for cytochrome C. Our model captures all observed cross-links for FGF and almost 90% of the observed cross-links for cytochrome C, although it also predicts cross-links that were not observed experimentally (false positives). However, the analysis of the false positive results is complicated by the fact that experimental detection of cross-links can be difficult and may depend on specific experimental conditions such as pH, ionic strength. Receiver operator characteristic plots showed that our model does a good job in predicting the observed cross-links. Molecular dynamics simulations showed that for cytochrome C, in general, the two lysines come closer for the observed cross-links as compared to the false positive ones. For FGF, no such clear pattern exists. The kinetic model and MD simulation can be used to study proposed cross-linking protocols.

  3. Algorithms

    Indian Academy of Sciences (India)

    algorithm design technique called 'divide-and-conquer'. One of ... Turtle graphics, September. 1996. 5. ... whole list named 'PO' is a pointer to the first element of the list; ..... Program for computing matrices X and Y and placing the result in C *).

  4. Algorithms

    Indian Academy of Sciences (India)

    algorithm that it is implicitly understood that we know how to generate the next natural ..... Explicit comparisons are made in line (1) where maximum and minimum is ... It can be shown that the function T(n) = 3/2n -2 is the solution to the above ...

  5. Performance of local information-based link prediction: a sampling perspective

    Science.gov (United States)

    Zhao, Jichang; Feng, Xu; Dong, Li; Liang, Xiao; Xu, Ke

    2012-08-01

    Link prediction is pervasively employed to uncover the missing links in the snapshots of real-world networks, which are usually obtained through different kinds of sampling methods. In the previous literature, in order to evaluate the performance of the prediction, known edges in the sampled snapshot are divided into the training set and the probe set randomly, without considering the underlying sampling approaches. However, different sampling methods might lead to different missing links, especially for the biased ways. For this reason, random partition-based evaluation of performance is no longer convincing if we take the sampling method into account. In this paper, we try to re-evaluate the performance of local information-based link predictions through sampling method governed division of the training set and the probe set. It is interesting that we find that for different sampling methods, each prediction approach performs unevenly. Moreover, most of these predictions perform weakly when the sampling method is biased, which indicates that the performance of these methods might have been overestimated in the prior works.

  6. Assessing Long-Term Wind Conditions by Combining Different Measure-Correlate-Predict Algorithms: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, J.; Chowdhury, S.; Messac, A.; Hodge, B. M.

    2013-08-01

    This paper significantly advances the hybrid measure-correlate-predict (MCP) methodology, enabling it to account for variations of both wind speed and direction. The advanced hybrid MCP method uses the recorded data of multiple reference stations to estimate the long-term wind condition at a target wind plant site. The results show that the accuracy of the hybrid MCP method is highly sensitive to the combination of the individual MCP algorithms and reference stations. It was also found that the best combination of MCP algorithms varies based on the length of the correlation period.

  7. Investigation of a breathing surrogate prediction algorithm for prospective pulmonary gating

    International Nuclear Information System (INIS)

    White, Benjamin M.; Low, Daniel A.; Zhao Tianyu; Wuenschel, Sara; Lu, Wei; Lamb, James M.; Mutic, Sasa; Bradley, Jeffrey D.; El Naqa, Issam

    2011-01-01

    Purpose: A major challenge of four dimensional computed tomography (4DCT) in treatment planning and delivery has been the lack of respiration amplitude and phase reproducibility during image acquisition. The implementation of a prospective gating algorithm would ensure that images would be acquired only during user-specified breathing phases. This study describes the development and testing of an autoregressive moving average (ARMA) model for human respiratory phase prediction under quiet respiration conditions. Methods: A total of 47 4DCT patient datasets and synchronized respiration records was utilized in this study. Three datasets were used in model development and were removed from further evaluation of the ARMA model. The remaining 44 patient datasets were evaluated with the ARMA model for prediction time steps from 50 to 1000 ms in increments of 50 and 100 ms. Thirty-five of these datasets were further used to provide a comparison between the proposed ARMA model and a commercial algorithm with a prediction time step of 240 ms. Results: The optimal number of parameters for the ARMA model was based on three datasets reserved for model development. Prediction error was found to increase as the prediction time step increased. The minimum prediction time step required for prospective gating was selected to be half of the gantry rotation period. The maximum prediction time step with a conservative 95% confidence criterion was found to be 0.3 s. The ARMA model predicted peak inhalation and peak exhalation phases significantly better than the commercial algorithm. Furthermore, the commercial algorithm had numerous instances of missed breath cycles and falsely predicted breath cycles, while the proposed model did not have these errors. Conclusions: An ARMA model has been successfully applied to predict human respiratory phase occurrence. For a typical CT scanner gantry rotation period of 0.4 s (0.2 s prediction time step), the absolute error was relatively small, 0

  8. WINROP algorithm for prediction of sight threatening retinopathy of prematurity: Initial experience in Indian preterm infants

    Directory of Open Access Journals (Sweden)

    Gaurav Sanghi

    2018-01-01

    Full Text Available Purpose: To determine the efficacy of the online monitoring tool, WINROP (https://winrop.com/ in detecting sight-threatening type 1 retinopathy of prematurity (ROP in Indian preterm infants. Methods: Birth weight, gestational age, and weekly weight measurements of seventy preterm infants (<32 weeks gestation born between June 2014 and August 2016 were entered into WINROP algorithm. Based on weekly weight gain, WINROP algorithm signaled an alarm to indicate that the infant is at risk for sight-threatening Type 1 ROP. ROP screening was done according to standard guidelines. The negative and positive predictive values were calculated using the sensitivity, specificity, and prevalence of ROP type 1 for the study group. 95% confidence interval (CI was calculated. Results: Of the seventy infants enrolled in the study, 31 (44.28% developed Type 1 ROP. WINROP alarm was signaled in 74.28% (52/70 of all infants and 90.32% (28/31 of infants treated for Type 1 ROP. The specificity was 38.46% (15/39. The positive predictive value was 53.84% (95% CI: 39.59–67.53 and negative predictive value was 83.3% (95% CI: 57.73–95.59. Conclusion: This is the first study from India using a weight gain-based algorithm for prediction of ROP. Overall sensitivity of WINROP algorithm in detecting Type 1 ROP was 90.32%. The overall specificity was 38.46%. Population-specific tweaking of algorithm may improve the result and practical utility for ophthalmologists and neonatologists.

  9. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    Directory of Open Access Journals (Sweden)

    Xiaoxia Yang

    Full Text Available Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  10. TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers

    Science.gov (United States)

    Cao, Han; Ng, Marcus C. K.; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W. I.

    2017-09-01

    α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD PHP, MySQL and Apache, with all major browsers supported.

  11. A community effort to assess and improve drug sensitivity prediction algorithms.

    Science.gov (United States)

    Costello, James C; Heiser, Laura M; Georgii, Elisabeth; Gönen, Mehmet; Menden, Michael P; Wang, Nicholas J; Bansal, Mukesh; Ammad-ud-din, Muhammad; Hintsanen, Petteri; Khan, Suleiman A; Mpindi, John-Patrick; Kallioniemi, Olli; Honkela, Antti; Aittokallio, Tero; Wennerberg, Krister; Collins, James J; Gallahan, Dan; Singer, Dinah; Saez-Rodriguez, Julio; Kaski, Samuel; Gray, Joe W; Stolovitzky, Gustavo

    2014-12-01

    Predicting the best treatment strategy from genomic information is a core goal of precision medicine. Here we focus on predicting drug response based on a cohort of genomic, epigenomic and proteomic profiling data sets measured in human breast cancer cell lines. Through a collaborative effort between the National Cancer Institute (NCI) and the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we analyzed a total of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biological pathway information. We found that gene expression microarrays consistently provided the best predictive power of the individual profiling data sets; however, performance was increased by including multiple, independent data sets. We discuss the innovations underlying the top-performing methodology, Bayesian multitask MKL, and we provide detailed descriptions of all methods. This study establishes benchmarks for drug sensitivity prediction and identifies approaches that can be leveraged for the development of new methods.

  12. Study on predicting residual life of elevator links by fracture mechanics approach

    Energy Technology Data Exchange (ETDEWEB)

    Li Helin; Zhang Yi; Deng Zengjie [China National Petroleum Corp., Xi`an, Shaanxi (China). Tubular Goods Research Center; Jin Dazeng [Xi`an Jiaotong Univ., Xi`an, Shaanxi (China)

    1995-12-31

    On the basis of investigation, failure and fracture analysis of elevator links, residual life prediction of links using fracture mechanics approach is studied, and mechanical properties, fracture toughness value K{sub IC} and fatigue crack propagation rage da/dN of the steel for elevator links are determined. Using the relation between stress intensity factor K{sub I} and the strain-energy release rate, the two-dimensional conversion thickness finite element method has been used to calculate the stress intensity factors K{sub I} for dangerous sections in the ring part of links. Furthermore, the reliability of calculations of the finite element stress intensity factors K{sub I} for dangerous sections of elevator links and the residual life computation for links are verified by fatigue tests of actual links. Finally, the experimental verification of computed results by 150T link fractured at site indicates that the computed critical crack lengths and residual life tally well with those measured and meet the needs of oil drilling.

  13. Research on prediction of agricultural machinery total power based on grey model optimized by genetic algorithm

    Science.gov (United States)

    Xie, Yan; Li, Mu; Zhou, Jin; Zheng, Chang-zheng

    2009-07-01

    Agricultural machinery total power is an important index to reflex and evaluate the level of agricultural mechanization. It is the power source of agricultural production, and is the main factors to enhance the comprehensive agricultural production capacity expand production scale and increase the income of the farmers. Its demand is affected by natural, economic, technological and social and other "grey" factors. Therefore, grey system theory can be used to analyze the development of agricultural machinery total power. A method based on genetic algorithm optimizing grey modeling process is introduced in this paper. This method makes full use of the advantages of the grey prediction model and characteristics of genetic algorithm to find global optimization. So the prediction model is more accurate. According to data from a province, the GM (1, 1) model for predicting agricultural machinery total power was given based on the grey system theories and genetic algorithm. The result indicates that the model can be used as agricultural machinery total power an effective tool for prediction.

  14. Small hydropower spot prediction using SWAT and a diversion algorithm, case study: Upper Citarum Basin

    Science.gov (United States)

    Kardhana, Hadi; Arya, Doni Khaira; Hadihardaja, Iwan K.; Widyaningtyas, Riawan, Edi; Lubis, Atika

    2017-11-01

    Small-Scale Hydropower (SHP) had been important electric energy power source in Indonesia. Indonesia is vast countries, consists of more than 17.000 islands. It has large fresh water resource about 3 m of rainfall and 2 m of runoff. Much of its topography is mountainous, remote but abundant with potential energy. Millions of people do not have sufficient access to electricity, some live in the remote places. Recently, SHP development was encouraged for energy supply of the places. Development of global hydrology data provides opportunity to predict distribution of hydropower potential. In this paper, we demonstrate run-of-river type SHP spot prediction tool using SWAT and a river diversion algorithm. The use of Soil and Water Assessment Tool (SWAT) with input of CFSR (Climate Forecast System Re-analysis) of 10 years period had been implemented to predict spatially distributed flow cumulative distribution function (CDF). A simple algorithm to maximize potential head of a location by a river diversion expressing head race and penstock had been applied. Firm flow and power of the SHP were estimated from the CDF and the algorithm. The tool applied to Upper Citarum River Basin and three out of four existing hydropower locations had been well predicted. The result implies that this tool is able to support acceleration of SHP development at earlier phase.

  15. Paroxysmal atrial fibrillation prediction based on HRV analysis and non-dominated sorting genetic algorithm III.

    Science.gov (United States)

    Boon, K H; Khalil-Hani, M; Malarvili, M B

    2018-01-01

    This paper presents a method that able to predict the paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals when compared to existing methods, and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinical important because it increases the possibility to electrically stabilize and prevent the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, that consists of the stages of pre-processing, HRV feature extraction, and support vector machine (SVM) model. The pre-processing stage comprises of heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, non-linear HRV features are extracted from the pre-processed data in feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most of the previous works. This accuracy rate is achieved even with the HRV signal length being reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is the sensitivity rate, which is considered more important that other performance metrics in this paper, can be improved with the trade-off of lower specificity. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Urban Link Travel Time Prediction Based on a Gradient Boosting Method Considering Spatiotemporal Correlations

    Directory of Open Access Journals (Sweden)

    Faming Zhang

    2016-11-01

    Full Text Available The prediction of travel times is challenging because of the sparseness of real-time traffic data and the intrinsic uncertainty of travel on congested urban road networks. We propose a new gradient–boosted regression tree method to accurately predict travel times. This model accounts for spatiotemporal correlations extracted from historical and real-time traffic data for adjacent and target links. This method can deliver high prediction accuracy by combining simple regression trees with poor performance. It corrects the error found in existing models for improved prediction accuracy. Our spatiotemporal gradient–boosted regression tree model was verified in experiments. The training data were obtained from big data reflecting historic traffic conditions collected by probe vehicles in Wuhan from January to May 2014. Real-time data were extracted from 11 weeks of GPS records collected in Wuhan from 5 May 2014 to 20 July 2014. Based on these data, we predicted link travel time for the period from 21 July 2014 to 25 July 2014. Experiments showed that our proposed spatiotemporal gradient–boosted regression tree model obtained better results than gradient boosting, random forest, or autoregressive integrated moving average approaches. Furthermore, these results indicate the advantages of our model for urban link travel time prediction.

  17. Algorithms

    Indian Academy of Sciences (India)

    will become clear in the next article when we discuss a simple logo like programming language. ... Rod B may be used as an auxiliary store. The problem is to find an algorithm which performs this task. ... No disks are moved from A to Busing C as auxiliary rod. • move _disk (A, C);. (No + l)th disk is moved from A to C directly ...

  18. Real time implementation of a linear predictive coding algorithm on digital signal processor DSP32C

    International Nuclear Information System (INIS)

    Sheikh, N.M.; Usman, S.R.; Fatima, S.

    2002-01-01

    Pulse Code Modulation (PCM) has been widely used in speech coding. However, due to its high bit rate. PCM has severe limitations in application where high spectral efficiency is desired, for example, in mobile communication, CD quality broadcasting system etc. These limitation have motivated research in bit rate reduction techniques. Linear predictive coding (LPC) is one of the most powerful complex techniques for bit rate reduction. With the introduction of powerful digital signal processors (DSP) it is possible to implement the complex LPC algorithm in real time. In this paper we present a real time implementation of the LPC algorithm on AT and T's DSP32C at a sampling frequency of 8192 HZ. Application of the LPC algorithm on two speech signals is discussed. Using this implementation , a bit rate reduction of 1:3 is achieved for better than tool quality speech, while a reduction of 1.16 is possible for speech quality required in military applications. (author)

  19. The efficiency of the RULES-4 classification learning algorithm in predicting the density of agents

    Directory of Open Access Journals (Sweden)

    Ziad Salem

    2014-12-01

    Full Text Available Learning is the act of obtaining new or modifying existing knowledge, behaviours, skills or preferences. The ability to learn is found in humans, other organisms and some machines. Learning is always based on some sort of observations or data such as examples, direct experience or instruction. This paper presents a classification algorithm to learn the density of agents in an arena based on the measurements of six proximity sensors of a combined actuator sensor units (CASUs. Rules are presented that were induced by the learning algorithm that was trained with data-sets based on the CASU’s sensor data streams collected during a number of experiments with “Bristlebots (agents in the arena (environment”. It was found that a set of rules generated by the learning algorithm is able to predict the number of bristlebots in the arena based on the CASU’s sensor readings with satisfying accuracy.

  20. Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend

    Directory of Open Access Journals (Sweden)

    Montri Inthachot

    2016-01-01

    Full Text Available This study investigated the use of Artificial Neural Network (ANN and Genetic Algorithm (GA for prediction of Thailand’s SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid’s prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span.

  1. Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend

    Science.gov (United States)

    Boonjing, Veera; Intakosum, Sarun

    2016-01-01

    This study investigated the use of Artificial Neural Network (ANN) and Genetic Algorithm (GA) for prediction of Thailand's SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid's prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span. PMID:27974883

  2. A computational environment for long-term multi-feature and multi-algorithm seizure prediction.

    Science.gov (United States)

    Teixeira, C A; Direito, B; Costa, R P; Valderrama, M; Feldwisch-Drentrup, H; Nikolopoulos, S; Le Van Quyen, M; Schelter, B; Dourado, A

    2010-01-01

    The daily life of epilepsy patients is constrained by the possibility of occurrence of seizures. Until now, seizures cannot be predicted with sufficient sensitivity and specificity. Most of the seizure prediction studies have been focused on a small number of patients, and frequently assuming unrealistic hypothesis. This paper adopts the view that for an appropriate development of reliable predictors one should consider long-term recordings and several features and algorithms integrated in one software tool. A computational environment, based on Matlab (®), is presented, aiming to be an innovative tool for seizure prediction. It results from the need of a powerful and flexible tool for long-term EEG/ECG analysis by multiple features and algorithms. After being extracted, features can be subjected to several reduction and selection methods, and then used for prediction. The predictions can be conducted based on optimized thresholds or by applying computational intelligence methods. One important aspect is the integrated evaluation of the seizure prediction characteristic of the developed predictors.

  3. Biomine: predicting links between biological entities using network models of heterogeneous databases

    Directory of Open Access Journals (Sweden)

    Eronen Lauri

    2012-06-01

    Full Text Available Abstract Background Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Results Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. Conclusions The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable

  4. Analysis of energy-based algorithms for RNA secondary structure prediction

    Directory of Open Access Journals (Sweden)

    Hajiaghayi Monir

    2012-02-01

    Full Text Available Abstract Background RNA molecules play critical roles in the cells of organisms, including roles in gene regulation, catalysis, and synthesis of proteins. Since RNA function depends in large part on its folded structures, much effort has been invested in developing accurate methods for prediction of RNA secondary structure from the base sequence. Minimum free energy (MFE predictions are widely used, based on nearest neighbor thermodynamic parameters of Mathews, Turner et al. or those of Andronescu et al. Some recently proposed alternatives that leverage partition function calculations find the structure with maximum expected accuracy (MEA or pseudo-expected accuracy (pseudo-MEA methods. Advances in prediction methods are typically benchmarked using sensitivity, positive predictive value and their harmonic mean, namely F-measure, on datasets of known reference structures. Since such benchmarks document progress in improving accuracy of computational prediction methods, it is important to understand how measures of accuracy vary as a function of the reference datasets and whether advances in algorithms or thermodynamic parameters yield statistically significant improvements. Our work advances such understanding for the MFE and (pseudo-MEA-based methods, with respect to the latest datasets and energy parameters. Results We present three main findings. First, using the bootstrap percentile method, we show that the average F-measure accuracy of the MFE and (pseudo-MEA-based algorithms, as measured on our largest datasets with over 2000 RNAs from diverse families, is a reliable estimate (within a 2% range with high confidence of the accuracy of a population of RNA molecules represented by this set. However, average accuracy on smaller classes of RNAs such as a class of 89 Group I introns used previously in benchmarking algorithm accuracy is not reliable enough to draw meaningful conclusions about the relative merits of the MFE and MEA-based algorithms

  5. Efficient Algorithm for Computing Link-based Similarity in Real World Networks

    DEFF Research Database (Denmark)

    Cai, Yuanzhe; Cong, Gao; Xu, Jia

    2009-01-01

    Similarity calculation has many applications, such as information retrieval, and collaborative filtering, among many others. It has been shown that link-based similarity measure, such as SimRank, is very effective in characterizing the object similarities in networks, such as the Web, by exploiti...

  6. Retrieval algorithm for rainfall mapping from microwave links in a cellular communication network

    NARCIS (Netherlands)

    Overeem, Aart; Leijnse, Hidde; Uijlenhoet, Remko

    2016-01-01

    Microwave links in commercial cellular communication networks hold a promise for areal rainfall monitoring and could complement rainfall estimates from ground-based weather radars, rain gauges, and satellites. It has been shown that country-wide (≈ 35 500 km2) 15 min rainfall maps can

  7. Link Prediction Methods and Their Accuracy for Different Social Networks and Network Metrics

    Directory of Open Access Journals (Sweden)

    Fei Gao

    2015-01-01

    Full Text Available Currently, we are experiencing a rapid growth of the number of social-based online systems. The availability of the vast amounts of data gathered in those systems brings new challenges that we face when trying to analyse it. One of the intensively researched topics is the prediction of social connections between users. Although a lot of effort has been made to develop new prediction approaches, the existing methods are not comprehensively analysed. In this paper we investigate the correlation between network metrics and accuracy of different prediction methods. We selected six time-stamped real-world social networks and ten most widely used link prediction methods. The results of the experiments show that the performance of some methods has a strong correlation with certain network metrics. We managed to distinguish “prediction friendly” networks, for which most of the prediction methods give good performance, as well as “prediction unfriendly” networks, for which most of the methods result in high prediction error. Correlation analysis between network metrics and prediction accuracy of prediction methods may form the basis of a metalearning system where based on network characteristics it will be able to recommend the right prediction method for a given network.

  8. Multi-agent cooperation rescue algorithm based on influence degree and state prediction

    Science.gov (United States)

    Zheng, Yanbin; Ma, Guangfu; Wang, Linlin; Xi, Pengxue

    2018-04-01

    Aiming at the multi-agent cooperative rescue in disaster, a multi-agent cooperative rescue algorithm based on impact degree and state prediction is proposed. Firstly, based on the influence of the information in the scene on the collaborative task, the influence degree function is used to filter the information. Secondly, using the selected information to predict the state of the system and Agent behavior. Finally, according to the result of the forecast, the cooperative behavior of Agent is guided and improved the efficiency of individual collaboration. The simulation results show that this algorithm can effectively solve the cooperative rescue problem of multi-agent and ensure the efficient completion of the task.

  9. Training algorithms evaluation for artificial neural network to temporal prediction of photovoltaic generation

    International Nuclear Information System (INIS)

    Arantes Monteiro, Raul Vitor; Caixeta Guimarães, Geraldo; Rocio Castillo, Madeleine; Matheus Moura, Fabrício Augusto; Tamashiro, Márcio Augusto

    2016-01-01

    Current energy policies are encouraging the connection of power generation based on low-polluting technologies, mainly those using renewable sources, to distribution networks. Hence, it becomes increasingly important to understand technical challenges, facing high penetration of PV systems at the grid, especially considering the effects of intermittence of this source on the power quality, reliability and stability of the electric distribution system. This fact can affect the distribution networks on which they are attached causing overvoltage, undervoltage and frequency oscillations. In order to predict these disturbs, artificial neural networks are used. This article aims to analyze 3 training algorithms used in artificial neural networks for temporal prediction of the generated active power thru photovoltaic panels. As a result it was concluded that the algorithm with the best performance among the 3 analyzed was the Levenberg-Marquadrt.

  10. Artificial Fish Swarm Algorithm-Based Particle Filter for Li-Ion Battery Life Prediction

    Directory of Open Access Journals (Sweden)

    Ye Tian

    2014-01-01

    Full Text Available An intelligent online prognostic approach is proposed for predicting the remaining useful life (RUL of lithium-ion (Li-ion batteries based on artificial fish swarm algorithm (AFSA and particle filter (PF, which is an integrated approach combining model-based method with data-driven method. The parameters, used in the empirical model which is based on the capacity fade trends of Li-ion batteries, are identified dependent on the tracking ability of PF. AFSA-PF aims to improve the performance of the basic PF. By driving the prior particles to the domain with high likelihood, AFSA-PF allows global optimization, prevents particle degeneracy, thereby improving particle distribution and increasing prediction accuracy and algorithm convergence. Data provided by NASA are used to verify this approach and compare it with basic PF and regularized PF. AFSA-PF is shown to be more accurate and precise.

  11. HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

    Directory of Open Access Journals (Sweden)

    Xiaomin Wang

    2011-01-01

    Full Text Available With the availability of more and more genome-scale protein-protein interaction (PPI networks, research interests gradually shift to Systematic Analysis on these large data sets. A key topic is to predict protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has relatively high F-measure and exhibits better performance compared with some other methods.

  12. Performance prediction of a synchronization link for distributed aerospace wireless systems.

    Science.gov (United States)

    Wang, Wen-Qin; Shao, Huaizong

    2013-01-01

    For reasons of stealth and other operational advantages, distributed aerospace wireless systems have received much attention in recent years. In a distributed aerospace wireless system, since the transmitter and receiver placed on separated platforms which use independent master oscillators, there is no cancellation of low-frequency phase noise as in the monostatic cases. Thus, high accurate time and frequency synchronization techniques are required for distributed wireless systems. The use of a dedicated synchronization link to quantify and compensate oscillator frequency instability is investigated in this paper. With the mathematical statistical models of phase noise, closed-form analytic expressions for the synchronization link performance are derived. The possible error contributions including oscillator, phase-locked loop, and receiver noise are quantified. The link synchronization performance is predicted by utilizing the knowledge of the statistical models, system error contributions, and sampling considerations. Simulation results show that effective synchronization error compensation can be achieved by using this dedicated synchronization link.

  13. Prediction of Tibial Rotation Pathologies Using Particle Swarm Optimization and K-Means Algorithms.

    Science.gov (United States)

    Sari, Murat; Tuna, Can; Akogul, Serkan

    2018-03-28

    The aim of this article is to investigate pathological subjects from a population through different physical factors. To achieve this, particle swarm optimization (PSO) and K-means (KM) clustering algorithms have been combined (PSO-KM). Datasets provided by the literature were divided into three clusters based on age and weight parameters and each one of right tibial external rotation (RTER), right tibial internal rotation (RTIR), left tibial external rotation (LTER), and left tibial internal rotation (LTIR) values were divided into three types as Type 1, Type 2 and Type 3 (Type 2 is non-pathological (normal) and the other two types are pathological (abnormal)), respectively. The rotation values of every subject in any cluster were noted. Then the algorithm was run and the produced values were also considered. The values of the produced algorithm, the PSO-KM, have been compared with the real values. The hybrid PSO-KM algorithm has been very successful on the optimal clustering of the tibial rotation types through the physical criteria. In this investigation, Type 2 (pathological subjects) is of especially high predictability and the PSO-KM algorithm has been very successful as an operation system for clustering and optimizing the tibial motion data assessments. These research findings are expected to be very useful for health providers, such as physiotherapists, orthopedists, and so on, in which this consequence may help clinicians to appropriately designing proper treatment schedules for patients.

  14. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM algorithms for integrating heterogenous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference, enhance prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.

  15. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    Science.gov (United States)

    Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  16. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    Energy Technology Data Exchange (ETDEWEB)

    Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M. [Applied Electromagnetic Research Institute, National Institute of Information and Communications Technology, 4-2-1, Nukui-Kitamachi, Koganei, Tokyo 184-8795 (Japan); Sugiura, K., E-mail: nishizuka.naoto@nict.go.jp [Advanced Speech Translation Research and Development Promotion Center, National Institute of Information and Communications Technology (Japan)

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  17. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    International Nuclear Information System (INIS)

    Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.; Sugiura, K.

    2017-01-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  18. Predictive Control of Hydronic Floor Heating Systems using Neural Networks and Genetic Algorithms

    DEFF Research Database (Denmark)

    Vinther, Kasper; Green, Torben; Østergaard, Søren

    2017-01-01

    This paper presents the use a neural network and a micro genetic algorithm to optimize future set-points in existing hydronic floor heating systems for improved energy efficiency. The neural network can be trained to predict the impact of changes in set-points on future room temperatures. Additio...... space is not guaranteed. Evaluation of the performance of multiple neural networks is performed, using different levels of information, and optimization results are presented on a detailed house simulation model....

  19. On the best learning algorithm for web services response time prediction

    DEFF Research Database (Denmark)

    Madsen, Henrik; Albu, Razvan-Daniel; Popentiu-Vladicescu, Florin

    2013-01-01

    In this article we will examine the effect of different learning algorithms, while training the MLP (Multilayer Perceptron) with the intention of predicting web services response time. Web services do not necessitate a user interface. This may seem contradictory to most people's concept of what...... an application is. A Web service is better imagined as an application "segment," or better as a program enabler. Performance is an important quality aspect of Web services because of their distributed nature. Predicting the response of web services during their operation is very important....

  20. The fatigue life prediction of aluminium alloy using genetic algorithm and neural network

    Science.gov (United States)

    Susmikanti, Mike

    2013-09-01

    The behavior of the fatigue life of the industrial materials is very important. In many cases, the material with experiencing fatigue life cannot be avoided, however, there are many ways to control their behavior. Many investigations of the fatigue life phenomena of alloys have been done, but it is high cost and times consuming computation. This paper report the modeling and simulation approaches to predict the fatigue life behavior of Aluminum Alloys and resolves some problems of computation. First, the simulation using genetic algorithm was utilized to optimize the load to obtain the stress values. These results can be used to provide N-cycle fatigue life of the material. Furthermore, the experimental data was applied as input data in the neural network learning, while the samples data were applied for testing of the training data. Finally, the multilayer perceptron algorithm is applied to predict whether the given data sets in accordance with the fatigue life of the alloy. To achieve rapid convergence, the Levenberg-Marquardt algorithm was also employed. The simulations results shows that the fatigue behaviors of aluminum under pressure can be predicted. In addition, implementation of neural networks successfully identified a model for material fatigue life.

  1. Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.

    Science.gov (United States)

    Huang, Cai; Mezencev, Roman; McDonald, John F; Vannberg, Fredrik

    2017-01-01

    Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM) algorithm combined with a standard recursive feature elimination (RFE) approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60). The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publically available ovarian cancer (OC) patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.

  2. Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.

    Directory of Open Access Journals (Sweden)

    Cai Huang

    Full Text Available Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM algorithm combined with a standard recursive feature elimination (RFE approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60. The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publically available ovarian cancer (OC patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.

  3. TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

    Science.gov (United States)

    Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

    2018-04-11

    Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions

  4. A New Tool for CME Arrival Time Prediction using Machine Learning Algorithms: CAT-PUMA

    Science.gov (United States)

    Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert

    2018-03-01

    Coronal mass ejections (CMEs) are arguably the most violent eruptions in the solar system. CMEs can cause severe disturbances in interplanetary space and can even affect human activities in many aspects, causing damage to infrastructure and loss of revenue. Fast and accurate prediction of CME arrival time is vital to minimize the disruption that CMEs may cause when interacting with geospace. In this paper, we propose a new approach for partial-/full halo CME Arrival Time Prediction Using Machine learning Algorithms (CAT-PUMA). Via detailed analysis of the CME features and solar-wind parameters, we build a prediction engine taking advantage of 182 previously observed geo-effective partial-/full halo CMEs and using algorithms of the Support Vector Machine. We demonstrate that CAT-PUMA is accurate and fast. In particular, predictions made after applying CAT-PUMA to a test set unknown to the engine show a mean absolute prediction error of ∼5.9 hr within the CME arrival time, with 54% of the predictions having absolute errors less than 5.9 hr. Comparisons with other models reveal that CAT-PUMA has a more accurate prediction for 77% of the events investigated that can be carried out very quickly, i.e., within minutes of providing the necessary input parameters of a CME. A practical guide containing the CAT-PUMA engine and the source code of two examples are available in the Appendix, allowing the community to perform their own applications for prediction using CAT-PUMA.

  5. Parametric study on the advantages of weather-predicted control algorithm of free cooling ventilation system

    International Nuclear Information System (INIS)

    Medved, Sašo; Babnik, Miha; Vidrih, Boris; Arkar, Ciril

    2014-01-01

    Predicted climate changes and the increased intensity of urban heat islands, as well as population aging, will increase the energy demand for the cooling of buildings in the future. However, the energy demand for cooling can be efficiently reduced by low-exergy free-cooling systems, which use natural processes, like evaporative cooling or the environmental cold of ambient air during night-time ventilation for the cooling of buildings. Unlike mechanical cooling systems, the energy for the operation of free-cooling system is needed only for the transport of the cold from the environment into the building. Because the natural cold potential is time dependent, the efficiency of free-cooling systems could be improved by introducing a weather forecast into the algorithm for the controlling. In the article, a numerical algorithm for the optimization of the operation of free-cooling systems with night-time ventilation is presented and validated on a test cell with different thermal storage capacities and during different ambient conditions. As a case study, the advantage of weather-predicted controlling is presented for a summer week for typical office room. The results show the necessity of the weather-predicted controlling of free-cooling ventilation systems for achieving the highest overall energy efficiency of such systems in comparison to mechanical cooling, better indoor comfort conditions and a decrease in the primary energy needed for cooling of the buildings. - Highlights: • Energy demand for cooling will increase due to climate changes and urban heat island • Free cooling could significantly reduce energy demand for cooling of the buildings. • Free cooling is more effective if weather prediction is included in operation control. • Weather predicted free cooling operation algorithm was validated on test cell. • Advantages of free-cooling on mechanical cooling is shown with different indicators

  6. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents from further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs and Translation Initiation Sites (TISs. The former is based on a linguistic "Entropy Density Profile" (EDP model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  7. Application of Genetic Algorithm to Predict Optimal Sowing Region and Timing for Kentucky Bluegrass in China.

    Directory of Open Access Journals (Sweden)

    Erxu Pi

    Full Text Available Temperature is a predominant environmental factor affecting grass germination and distribution. Various thermal-germination models for prediction of grass seed germination have been reported, in which the relationship between temperature and germination were defined with kernel functions, such as quadratic or quintic function. However, their prediction accuracies warrant further improvements. The purpose of this study is to evaluate the relative prediction accuracies of genetic algorithm (GA models, which are automatically parameterized with observed germination data. The seeds of five P. pratensis (Kentucky bluegrass, KB cultivars were germinated under 36 day/night temperature regimes ranging from 5/5 to 40/40 °C with 5 °C increments. Results showed that optimal germination percentages of all five tested KB cultivars were observed under a fluctuating temperature regime of 20/25 °C. Meanwhile, the constant temperature regimes (e.g., 5/5, 10/10, 15/15 °C, etc. suppressed the germination of all five cultivars. Furthermore, the back propagation artificial neural network (BP-ANN algorithm was integrated to optimize temperature-germination response models from these observed germination data. It was found that integrations of GA-BP-ANN (back propagation aided genetic algorithm artificial neural network significantly reduced the Root Mean Square Error (RMSE values from 0.21~0.23 to 0.02~0.09. In an effort to provide a more reliable prediction of optimum sowing time for the tested KB cultivars in various regions in the country, the optimized GA-BP-ANN models were applied to map spatial and temporal germination percentages of blue grass cultivars in China. Our results demonstrate that the GA-BP-ANN model is a convenient and reliable option for constructing thermal-germination response models since it automates model parameterization and has excellent prediction accuracy.

  8. Prediction of insemination outcomes in Holstein dairy cattle using alternative machine learning algorithms.

    Science.gov (United States)

    Shahinfar, Saleh; Page, David; Guenther, Jerry; Cabrera, Victor; Fricke, Paul; Weigel, Kent

    2014-02-01

    When making the decision about whether or not to breed a given cow, knowledge about the expected outcome would have an economic impact on profitability of the breeding program and net income of the farm. The outcome of each breeding can be affected by many management and physiological features that vary between farms and interact with each other. Hence, the ability of machine learning algorithms to accommodate complex relationships in the data and missing values for explanatory variables makes these algorithms well suited for investigation of reproduction performance in dairy cattle. The objective of this study was to develop a user-friendly and intuitive on-farm tool to help farmers make reproduction management decisions. Several different machine learning algorithms were applied to predict the insemination outcomes of individual cows based on phenotypic and genotypic data. Data from 26 dairy farms in the Alta Genetics (Watertown, WI) Advantage Progeny Testing Program were used, representing a 10-yr period from 2000 to 2010. Health, reproduction, and production data were extracted from on-farm dairy management software, and estimated breeding values were downloaded from the US Department of Agriculture Agricultural Research Service Animal Improvement Programs Laboratory (Beltsville, MD) database. The edited data set consisted of 129,245 breeding records from primiparous Holstein cows and 195,128 breeding records from multiparous Holstein cows. Each data point in the final data set included 23 and 25 explanatory variables and 1 binary outcome for of 0.756 ± 0.005 and 0.736 ± 0.005 for primiparous and multiparous cows, respectively. The naïve Bayes algorithm, Bayesian network, and decision tree algorithms showed somewhat poorer classification performance. An information-based variable selection procedure identified herd average conception rate, incidence of ketosis, number of previous (failed) inseminations, days in milk at breeding, and mastitis as the most

  9. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    Directory of Open Access Journals (Sweden)

    Xiao-Lin Wu

    Full Text Available Low-density (LD single nucleotide polymorphism (SNP arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD or high-density (HD SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE or haplotype-averaged Shannon entropy (HASE and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus

  10. A novel gene network inference algorithm using predictive minimum description length approach.

    Science.gov (United States)

    Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

    2010-05-28

    Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced less false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the

  11. Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm

    Science.gov (United States)

    Öhrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

    2010-01-01

    One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes. PMID:20864443

  12. Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

    Science.gov (United States)

    Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

    2018-01-01

    T-cell epitope structure identification is a significant challenging immunoinformatic problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. The aim of this process is presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. Also, we achieved an accuracy of 95

  13. Choosing algorithms for TB screening: a modelling study to compare yield, predictive value and diagnostic burden.

    Science.gov (United States)

    Van't Hoog, Anna H; Onozaki, Ikushi; Lonnroth, Knut

    2014-10-19

    To inform the choice of an appropriate screening and diagnostic algorithm for tuberculosis (TB) screening initiatives in different epidemiological settings, we compare algorithms composed of currently available methods. Of twelve algorithms composed of screening for symptoms (prolonged cough or any TB symptom) and/or chest radiography abnormalities, and either sputum-smear microscopy (SSM) or Xpert MTB/RIF (XP) as confirmatory test we model algorithm outcomes and summarize the yield, number needed to screen (NNS) and positive predictive value (PPV) for different levels of TB prevalence. Screening for prolonged cough has low yield, 22% if confirmatory testing is by SSM and 32% if XP, and a high NNS, exceeding 1000 if TB prevalence is ≤0.5%. Due to low specificity the PPV of screening for any TB symptom followed by SSM is less than 50%, even if TB prevalence is 2%. CXR screening for TB abnormalities followed by XP has the highest case detection (87%) and lowest NNS, but is resource intensive. CXR as a second screen for symptom screen positives improves efficiency. The ideal algorithm does not exist. The choice will be setting specific, for which this study provides guidance. Generally an algorithm composed of CXR screening followed by confirmatory testing with XP can achieve the lowest NNS and highest PPV, and is the least amenable to setting-specific variation. However resource requirements for tests and equipment may be prohibitive in some settings and a reason to opt for symptom screening and SSM. To better inform disease control programs we need empirical data to confirm the modeled yield, cost-effectiveness studies, transmission models and a better screening test.

  14. Impact of Nonlinear Power Amplifier on Link Adaptation Algorithm of OFDM Systems

    DEFF Research Database (Denmark)

    Das, Suvra S.; Tariq, Faisal; Rahman, Muhammad Imadur

    2007-01-01

    The impact of non linear distortion due to High Power Amplifier (HPA) on the performance of Link Adaptation (LA) - Orthogonal Frequency Division Multiplexing (OFDM) based wireless system is analyzed. The performance of both Forward Error Control Coding (FEC) en-coded and uncoded system is evaluated....... LA maximizes the throughput while maintaining a required Block Error Rate (BLER). It is found that when OFDM signal, which has high PAPR, suffers non linear distortion due to non ideal HPA, the LA fails to meet the target BLER. Detailed analysis of the distortion and effects on LA are presented...

  15. NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Jin Changjiang

    2006-10-01

    Full Text Available Abstract Background Protein palmitoylation, an essential and reversible post-translational modification (PTM, has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are much desirable for further experimental design. Results In this work, we present NBA-Palm, a novel computational method based on Naïve Bayes algorithm for prediction of palmitoylation site. The training data is curated from scientific literature (PubMed and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, RBF network and support vector machine (SVM, also have been employed and compared with NBA-Palm. Conclusion Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. The NBA-Palm is freely accessible from: http://www.bioinfo.tsinghua.edu.cn/NBA-Palm.

  16. NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm.

    Science.gov (United States)

    Xue, Yu; Chen, Hu; Jin, Changjiang; Sun, Zhirong; Yao, Xuebiao

    2006-10-17

    Protein palmitoylation, an essential and reversible post-translational modification (PTM), has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are much desirable for further experimental design. In this work, we present NBA-Palm, a novel computational method based on Naïve Bayes algorithm for prediction of palmitoylation site. The training data is curated from scientific literature (PubMed) and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, RBF network and support vector machine (SVM), also have been employed and compared with NBA-Palm. Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. The NBA-Palm is freely accessible from: http://www.bioinfo.tsinghua.edu.cn/NBA-Palm.

  17. A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm.

    Directory of Open Access Journals (Sweden)

    Daniel M de Brito

    Full Text Available Genomic Islands (GIs are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic islands prediction have been proposed. However, none of them are capable of predicting precisely the complete repertory of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distribution in different species. In this paper, we present a novel method to predict GIs that is built upon mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP--Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me.

  18. [Prediction of regional soil quality based on mutual information theory integrated with decision tree algorithm].

    Science.gov (United States)

    Lin, Fen-Fang; Wang, Ke; Yang, Ning; Yan, Shi-Guang; Zheng, Xin-Yu

    2012-02-01

    In this paper, some main factors such as soil type, land use pattern, lithology type, topography, road, and industry type that affect soil quality were used to precisely obtain the spatial distribution characteristics of regional soil quality, mutual information theory was adopted to select the main environmental factors, and decision tree algorithm See 5.0 was applied to predict the grade of regional soil quality. The main factors affecting regional soil quality were soil type, land use, lithology type, distance to town, distance to water area, altitude, distance to road, and distance to industrial land. The prediction accuracy of the decision tree model with the variables selected by mutual information was obviously higher than that of the model with all variables, and, for the former model, whether of decision tree or of decision rule, its prediction accuracy was all higher than 80%. Based on the continuous and categorical data, the method of mutual information theory integrated with decision tree could not only reduce the number of input parameters for decision tree algorithm, but also predict and assess regional soil quality effectively.

  19. A novel constrained H2 optimization algorithm for mechatronics design in flexure-linked biaxial gantry.

    Science.gov (United States)

    Ma, Jun; Chen, Si-Lu; Kamaldin, Nazir; Teo, Chek Sing; Tay, Arthur; Mamun, Abdullah Al; Tan, Kok Kiong

    2017-11-01

    The biaxial gantry is widely used in many industrial processes that require high precision Cartesian motion. The conventional rigid-link version suffers from breaking down of joints if any de-synchronization between the two carriages occurs. To prevent above potential risk, a flexure-linked biaxial gantry is designed to allow a small rotation angle of the cross-arm. Nevertheless, the chattering of control signals and inappropriate design of the flexure joint will possibly induce resonant modes of the end-effector. Thus, in this work, the design requirements in terms of tracking accuracy, biaxial synchronization, and resonant mode suppression are achieved by integrated optimization of the stiffness of flexures and PID controller parameters for a class of point-to-point reference trajectories with same dynamics but different steps. From here, an H 2 optimization problem with defined constraints is formulated, and an efficient iterative solver is proposed by hybridizing direct computation of constrained projection gradient and line search of optimal step. Comparative experimental results obtained on the testbed are presented to verify the effectiveness of the proposed method. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  20. Predicting the Occurrence of Haze Events in Southeast Asia using Machine Learning Algorithms

    Science.gov (United States)

    Lee, H. H.; Chulakadabba, A.; Tonks, A.; Yang, Z.; Wang, C.

    2017-12-01

    Severe local- and regional-scale air pollution episodes typically originate from 1) high emissions of air pollutants, 2) poor dispersion conditions, and 3) trans-boundary pollutant transport. Biomass burning activities have become more frequent in Southeast Asia, especially in Sumatra, Borneo, and the mainland Southeast. Trans-boundary transport of biomass burning aerosols often lead to air quality problems in the region. Furthermore, particulate pollutants from human activities besides biomass burning also play an important role in the air quality of Southeast Asia. Singapore, for example, has a dynamic industrial sector including chemical, electric and metallurgic industries, and is the region's major petroleum-refining center. In addition, natural gas and oil power plants, waste incinerators, active port traffic, and a major regional airport further complicate Singapore's air quality issues. In this study, we compare five Machine Learning algorithms: k-Nearest Neighbors, Linear Support Vector Machine, Decision Tree, Random Forest and Artificial Neural Network, to identify haze patterns and determine variable importance. The algorithms were trained using local atmospheric data (i.e. months, atmospheric conditions, wind direction and relative humidity) from three observation stations in Singapore (Changi, Seletar and Paya Labar). We find that the algorithms reveal the associations in data within and between the stations, and provide in-depth interpretation of the haze sources. The algorithms also allow us to predict the probability of haze episodes in Singapore and to determine the correlation between this probability and atmospheric conditions.

  1. A new spirometry-based algorithm to predict occupational pulmonary restrictive impairment.

    Science.gov (United States)

    De Matteis, S; Iridoy-Zulet, A A; Aaron, S; Swann, A; Cullinan, P

    2016-01-01

    Spirometry is often included in workplace-based respiratory surveillance programmes but its performance in the identification of restrictive lung disease is poor, especially when the prevalence of this condition is low in the tested population. To improve the specificity (Sp) and positive predictive value (PPV) of current spirometry-based algorithms in the diagnosis of restrictive pulmonary impairment in the workplace and to reduce the proportion of false positives findings and, as a result, unnecessary referrals for lung volume measurements. We re-analysed two studies of hospital patients, respectively used to derive and validate a recommended spirometry-based algorithm [forced vital capacity (FVC) 55%] for the recognition of restrictive pulmonary impairment. We used true lung restrictive cases as a reference standard in 2×2 contingency tables to estimate sensitivity (Sn), Sp and PPV and negative predictive values for each diagnostic cut-off. We simulated a working population aged spirometry-based algorithm may be adopted to accurately exclude pulmonary restriction and to possibly reduce unnecessary lung volume testing in an occupational health setting. © The Author 2015. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  2. Advanced Emergency Braking Control Based on a Nonlinear Model Predictive Algorithm for Intelligent Vehicles

    Directory of Open Access Journals (Sweden)

    Ronghui Zhang

    2017-05-01

    Full Text Available Focusing on safety, comfort and with an overall aim of the comprehensive improvement of a vision-based intelligent vehicle, a novel Advanced Emergency Braking System (AEBS is proposed based on Nonlinear Model Predictive Algorithm. Considering the nonlinearities of vehicle dynamics, a vision-based longitudinal vehicle dynamics model is established. On account of the nonlinear coupling characteristics of the driver, surroundings, and vehicle itself, a hierarchical control structure is proposed to decouple and coordinate the system. To avoid or reduce the collision risk between the intelligent vehicle and collision objects, a coordinated cost function of tracking safety, comfort, and fuel economy is formulated. Based on the terminal constraints of stable tracking, a multi-objective optimization controller is proposed using the theory of non-linear model predictive control. To quickly and precisely track control target in a finite time, an electronic brake controller for AEBS is designed based on the Nonsingular Fast Terminal Sliding Mode (NFTSM control theory. To validate the performance and advantages of the proposed algorithm, simulations are implemented. According to the simulation results, the proposed algorithm has better integrated performance in reducing the collision risk and improving the driving comfort and fuel economy of the smart car compared with the existing single AEBS.

  3. Evaluation of machine learning algorithms for prediction of regions of high Reynolds averaged Navier Stokes uncertainty

    Science.gov (United States)

    Ling, J.; Templeton, J.

    2015-08-01

    Reynolds Averaged Navier Stokes (RANS) models are widely used in industry to predict fluid flows, despite their acknowledged deficiencies. Not only do RANS models often produce inaccurate flow predictions, but there are very limited diagnostics available to assess RANS accuracy for a given flow configuration. If experimental or higher fidelity simulation results are not available for RANS validation, there is no reliable method to evaluate RANS accuracy. This paper explores the potential of utilizing machine learning algorithms to identify regions of high RANS uncertainty. Three different machine learning algorithms were evaluated: support vector machines, Adaboost decision trees, and random forests. The algorithms were trained on a database of canonical flow configurations for which validated direct numerical simulation or large eddy simulation results were available, and were used to classify RANS results on a point-by-point basis as having either high or low uncertainty, based on the breakdown of specific RANS modeling assumptions. Classifiers were developed for three different basic RANS eddy viscosity model assumptions: the isotropy of the eddy viscosity, the linearity of the Boussinesq hypothesis, and the non-negativity of the eddy viscosity. It is shown that these classifiers are able to generalize to flows substantially different from those on which they were trained. Feature selection techniques, model evaluation, and extrapolation detection are discussed in the context of turbulence modeling applications.

  4. Monitoring of the future strong Vrancea events by using the CN formal earthquake prediction algorithm

    International Nuclear Information System (INIS)

    Moldoveanu, C.L.; Novikova, O.V.; Panza, G.F.; Radulian, M.

    2003-06-01

    The preparation process of the strong subcrustal events originating in Vrancea region, Romania, is monitored using an intermediate-term medium-range earthquake prediction method - the CN algorithm (Keilis-Borok and Rotwain, 1990). We present the results of the monitoring of the preparation of future strong earthquakes for the time interval from January 1, 1994 (1994.1.1), to January 1, 2003 (2003.1.1) using the updated catalogue of the Romanian local network. The database considered for the CN monitoring of the preparation of future strong earthquakes in Vrancea covers the period from 1966.3.1 to 2003.1.1 and the geographical rectangle 44.8 deg - 48.4 deg N, 25.0 deg - 28.0 deg E. The algorithm correctly identifies, by retrospective prediction, the TJPs for all the three strong earthquakes (Mo=6.4) that occurred in Vrancea during this period. The cumulated duration of the TIPs represents 26.5% of the total period of time considered (1966.3.1-2003.1.1). The monitoring of current seismicity using the algorithm CN has been carried out since 1994. No strong earthquakes occurred from 1994.1.1 to 2003.1.1 but the CN declared an extended false alarm from 1999.5.1 to 2000.11.1. No alarm has currently been declared in the region (on January 1, 2003), as can be seen from the TJPs diagram shown. (author)

  5. Slow Learner Prediction Using Multi-Variate Naïve Bayes Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Shiwani Rana

    2017-01-01

    Full Text Available Machine Learning is a field of computer science that learns from data by studying algorithms and their constructions. In machine learning, for specific inputs, algorithms help to make predictions. Classification is a supervised learning approach, which maps a data item into predefined classes. For predicting slow learners in an institute, a modified Naïve Bayes algorithm implemented. The implementation is carried sing Python.  It takes into account a combination of likewise multi-valued attributes. A dataset of the 60 students of BE (Information Technology Third Semester for the subject of Digital Electronics of University Institute of Engineering and Technology (UIET, Panjab University (PU, Chandigarh, India is taken to carry out the simulations. The analysis is done by choosing most significant forty-eight attributes. The experimental results have shown that the modified Naïve Bayes model has outperformed the Naïve Bayes Classifier in accuracy but requires significant improvement in the terms of elapsed time. By using Modified Naïve Bayes approach, the accuracy is found out to be 71.66% whereas it is calculated 66.66% using existing Naïve Bayes model. Further, a comparison is drawn by using WEKA tool. Here, an accuracy of Naïve Bayes is obtained as 58.33 %.

  6. An Automated Defect Prediction Framework using Genetic Algorithms: A Validation of Empirical Studies

    Directory of Open Access Journals (Sweden)

    Juan Murillo-Morera

    2016-05-01

    Full Text Available Today, it is common for software projects to collect measurement data through development processes. With these data, defect prediction software can try to estimate the defect proneness of a software module, with the objective of assisting and guiding software practitioners. With timely and accurate defect predictions, practitioners can focus their limited testing resources on higher risk areas. This paper reports the results of three empirical studies that uses an automated genetic defect prediction framework. This framework generates and compares different learning schemes (preprocessing + attribute selection + learning algorithms and selects the best one using a genetic algorithm, with the objective to estimate the defect proneness of a software module. The first empirical study is a performance comparison of our framework with the most important framework of the literature. The second empirical study is a performance and runtime comparison between our framework and an exhaustive framework. The third empirical study is a sensitivity analysis. The last empirical study, is our main contribution in this paper. Performance of the software development defect prediction models (using AUC, Area Under the Curve was validated using NASA-MDP and PROMISE data sets. Seventeen data sets from NASA-MDP (13 and PROMISE (4 projects were analyzed running a NxM-fold cross-validation. A genetic algorithm was used to select the components of the learning schemes automatically, and to assess and report the results. Our results reported similar performance between frameworks. Our framework reported better runtime than exhaustive framework. Finally, we reported the best configuration according to sensitivity analysis.

  7. Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

    Science.gov (United States)

    Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

    Encoding proteins of amino acid sequence to predict classified into their respective families and subfamilies is important research area. However for a given protein, knowing the exact action whether hormonal, enzymatic, transmembranal or nuclear receptors does not depend solely on amino acid sequence but on the way the amino acid thread folds as well. This study provides a prototype system that able to predict a protein tertiary structure. Several methods are used to develop and evaluate the system to produce better accuracy in protein 3D structure prediction. The Bees Optimization algorithm which inspired from the honey bees food foraging method, is used in the searching phase. In this study, the experiment is conducted on short sequence proteins that have been used by the previous researches using well-known tools. The proposed approach shows a promising result.

  8. Influence of Inter Carrier Interference on Link Adaptation Algorithms in OFDM Systems

    DEFF Research Database (Denmark)

    Das, Suvra S.; Rahman, Muhammad Imadur; Wang, Yuanye

    2007-01-01

    The performance of Link Adaptation (LA) under the influence of inter carrier interference (ICI), which is cause by carrier frequency offset (CFO) and Doppler frequency spread due to mobility, in orthogonal frequency division multiplexing (OFDM) based wireless systems is analyzed in this work. LA...... maximizes throughput while maintaining a target bit error rate (BER)or Block Error Rate (BLER) at the receiver using the signal to noise ratio (SNR) fed back from the receiver to the transmitter. Since ICI power is proportional to the signal strength, the implications of such an impairment on LA OFDM...... can over the problem. It is also noted that ICI severely reduces the spectral efficiency of OFDM systems even when LA is used....

  9. Small Body GN&C Research Report: A Robust Model Predictive Control Algorithm with Guaranteed Resolvability

    Science.gov (United States)

    Acikmese, Behcet A.; Carson, John M., III

    2005-01-01

    A robustly stabilizing MPC (model predictive control) algorithm for uncertain nonlinear systems is developed that guarantees the resolvability of the associated finite-horizon optimal control problem in a receding-horizon implementation. The control consists of two components; (i) feedforward, and (ii) feedback part. Feed-forward control is obtained by online solution of a finite-horizon optimal control problem for the nominal system dynamics. The feedback control policy is designed off-line based on a bound on the uncertainty in the system model. The entire controller is shown to be robustly stabilizing with a region of attraction composed of initial states for which the finite-horizon optimal control problem is feasible. The controller design for this algorithm is demonstrated on a class of systems with uncertain nonlinear terms that have norm-bounded derivatives, and derivatives in polytopes. An illustrative numerical example is also provided.

  10. A Dantzig-Wolfe decomposition algorithm for linear economic model predictive control of dynamically decoupled subsystems

    DEFF Research Database (Denmark)

    Sokoler, Leo Emil; Standardi, Laura; Edlund, Kristian

    2014-01-01

    This paper presents a warm-started Dantzig–Wolfe decomposition algorithm tailored to economic model predictive control of dynamically decoupled subsystems. We formulate the constrained optimal control problem solved at each sampling instant as a linear program with state space constraints, input...... limits, input rate limits, and soft output limits. The objective function of the linear program is related directly to the cost of operating the subsystems, and the cost of violating the soft output constraints. Simulations for large-scale economic power dispatch problems show that the proposed algorithm...... is significantly faster than both state-of-the-art linear programming solvers, and a structure exploiting implementation of the alternating direction method of multipliers. It is also demonstrated that the control strategy presented in this paper can be tuned using a weighted ℓ1-regularization term...

  11. Investigation of Diesel’s Residual Noise on Predictive Vehicles Noise Cancelling using LMS Adaptive Algorithm

    Science.gov (United States)

    Arttini Dwi Prasetyowati, Sri; Susanto, Adhi; Widihastuti, Ida

    2017-04-01

    Every noise problems require different solution. In this research, the noise that must be cancelled comes from roadway. Least Mean Square (LMS) adaptive is one of the algorithm that can be used to cancel that noise. Residual noise always appears and could not be erased completely. This research aims to know the characteristic of residual noise from vehicle’s noise and analysis so that it is no longer appearing as a problem. LMS algorithm was used to predict the vehicle’s noise and minimize the error. The distribution of the residual noise could be observed to determine the specificity of the residual noise. The statistic of the residual noise close to normal distribution with = 0,0435, = 1,13 and the autocorrelation of the residual noise forming impulse. As a conclusion the residual noise is insignificant.

  12. Prediction of Aerodynamic Coefficients for Wind Tunnel Data using a Genetic Algorithm Optimized Neural Network

    Science.gov (United States)

    Rajkumar, T.; Aragon, Cecilia; Bardina, Jorge; Britten, Roy

    2002-01-01

    A fast, reliable way of predicting aerodynamic coefficients is produced using a neural network optimized by a genetic algorithm. Basic aerodynamic coefficients (e.g. lift, drag, pitching moment) are modelled as functions of angle of attack and Mach number. The neural network is first trained on a relatively rich set of data from wind tunnel tests of numerical simulations to learn an overall model. Most of the aerodynamic parameters can be well-fitted using polynomial functions. A new set of data, which can be relatively sparse, is then supplied to the network to produce a new model consistent with the previous model and the new data. Because the new model interpolates realistically between the sparse test data points, it is suitable for use in piloted simulations. The genetic algorithm is used to choose a neural network architecture to give best results, avoiding over-and under-fitting of the test data.

  13. Does a Diagnostic Classification Algorithm Help to Predict the Course of Low Back Pain?

    DEFF Research Database (Denmark)

    Hartvigsen, Lisbeth; Kongsted, Alice; Vach, Werner

    2018-01-01

    ). Objectives To investigate if a diagnostic classification algorithm is associated with activity limitation and LBP intensity at 2-week and 3-month follow up, and 1-year trajectories of LBP intensity, and if it improves prediction of outcome when added to a set of known predictors. Methods 934 consecutive......Study Design A prospective observational study. Background A diagnostic classification algorithm was developed by Petersen et al., consisting of 12 categories based on a standardized examination protocol with the primary purpose of identifying clinically homogeneous subgroups of low back pain (LBP...... adult patients, with new episodes of LBP, who were visiting chiropractic practices in primary care were categorized according to the Petersen classification. Outcomes were disability and pain intensity measured at 2 weeks and 3 months, and 1-year trajectories of LBP based on weekly responses to text...

  14. Hidden State Prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses

    Directory of Open Access Journals (Sweden)

    Jesse Robert Zaneveld

    2014-08-01

    Full Text Available Complex symbioses between animal or plant hosts and their associated microbiotas can involve thousands of species and millions of genes. Because of the number of interacting partners, it is often impractical to study all organisms or genes in these host-microbe symbioses individually. Yet new phylogenetic predictive methods can use the wealth of accumulated data on diverse model organisms to make inferences into the properties of less well-studied species and gene families. Predictive functional profiling methods use evolutionary models based on the properties of studied relatives to put bounds on the likely characteristics of an organism or gene that has not yet been studied in detail. These techniques have been applied to predict diverse features of host-associated microbial communities ranging from the enzymatic function of uncharacterized genes to the gene content of uncultured microorganisms. We consider these phylogenetically-informed predictive techniques from disparate fields as examples of a general class of algorithms for Hidden State Prediction (HSP, and argue that HSP methods have broad value in predicting organismal traits in a variety of contexts, including the study of complex host-microbe symbioses.

  15. Hidden state prediction: a modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses.

    Science.gov (United States)

    Zaneveld, Jesse R R; Thurber, Rebecca L V

    2014-01-01

    Complex symbioses between animal or plant hosts and their associated microbiotas can involve thousands of species and millions of genes. Because of the number of interacting partners, it is often impractical to study all organisms or genes in these host-microbe symbioses individually. Yet new phylogenetic predictive methods can use the wealth of accumulated data on diverse model organisms to make inferences into the properties of less well-studied species and gene families. Predictive functional profiling methods use evolutionary models based on the properties of studied relatives to put bounds on the likely characteristics of an organism or gene that has not yet been studied in detail. These techniques have been applied to predict diverse features of host-associated microbial communities ranging from the enzymatic function of uncharacterized genes to the gene content of uncultured microorganisms. We consider these phylogenetically informed predictive techniques from disparate fields as examples of a general class of algorithms for Hidden State Prediction (HSP), and argue that HSP methods have broad value in predicting organismal traits in a variety of contexts, including the study of complex host-microbe symbioses.

  16. BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

    Science.gov (United States)

    Kaur, Harpreet; Raghava, G P S

    2002-03-01

    beta-turns play an important role from a structural and functional point of view. beta-turns are the most common type of non-repetitive structures in proteins and comprise on average, 25% of the residues. In the past numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-TURNS in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows to predict different types of beta-TURNS e.g. type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-TURNS in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/

  17. A fast EM algorithm for BayesA-like prediction of genomic breeding values.

    Directory of Open Access Journals (Sweden)

    Xiaochen Sun

    Full Text Available Prediction accuracies of estimated breeding values for economically important traits are expected to benefit from genomic information. Single nucleotide polymorphism (SNP panels used in genomic prediction are increasing in density, but the Markov Chain Monte Carlo (MCMC estimation of SNP effects can be quite time consuming or slow to converge when a large number of SNPs are fitted simultaneously in a linear mixed model. Here we present an EM algorithm (termed "fastBayesA" without MCMC. This fastBayesA approach treats the variances of SNP effects as missing data and uses a joint posterior mode of effects compared to the commonly used BayesA which bases predictions on posterior means of effects. In each EM iteration, SNP effects are predicted as a linear combination of best linear unbiased predictions of breeding values from a mixed linear animal model that incorporates a weighted marker-based realized relationship matrix. Method fastBayesA converges after a few iterations to a joint posterior mode of SNP effects under the BayesA model. When applied to simulated quantitative traits with a range of genetic architectures, fastBayesA is shown to predict GEBV as accurately as BayesA but with less computing effort per SNP than BayesA. Method fastBayesA can be used as a computationally efficient substitute for BayesA, especially when an increasing number of markers bring unreasonable computational burden or slow convergence to MCMC approaches.

  18. An early-biomarker algorithm predicts lethal graft-versus-host disease and survival.

    Science.gov (United States)

    Hartwell, Matthew J; Özbek, Umut; Holler, Ernst; Renteria, Anne S; Major-Monfried, Hannah; Reddy, Pavan; Aziz, Mina; Hogan, William J; Ayuk, Francis; Efebera, Yvonne A; Hexner, Elizabeth O; Bunworasate, Udomsak; Qayed, Muna; Ordemann, Rainer; Wölfl, Matthias; Mielke, Stephan; Pawarode, Attaphol; Chen, Yi-Bin; Devine, Steven; Harris, Andrew C; Jagasia, Madan; Kitko, Carrie L; Litzow, Mark R; Kröger, Nicolaus; Locatelli, Franco; Morales, George; Nakamura, Ryotaro; Reshef, Ran; Rösler, Wolf; Weber, Daniela; Wudhikarn, Kitsada; Yanik, Gregory A; Levine, John E; Ferrara, James L M

    2017-02-09

    BACKGROUND. No laboratory test can predict the risk of nonrelapse mortality (NRM) or severe graft-versus-host disease (GVHD) after hematopoietic cellular transplantation (HCT) prior to the onset of GVHD symptoms. METHODS. Patient blood samples on day 7 after HCT were obtained from a multicenter set of 1,287 patients, and 620 samples were assigned to a training set. We measured the concentrations of 4 GVHD biomarkers (ST2, REG3α, TNFR1, and IL-2Rα) and used them to model 6-month NRM using rigorous cross-validation strategies to identify the best algorithm that defined 2 distinct risk groups. We then applied the final algorithm in an independent test set ( n = 309) and validation set ( n = 358). RESULTS. A 2-biomarker model using ST2 and REG3α concentrations identified patients with a cumulative incidence of 6-month NRM of 28% in the high-risk group and 7% in the low-risk group ( P < 0.001). The algorithm performed equally well in the test set (33% vs. 7%, P < 0.001) and the multicenter validation set (26% vs. 10%, P < 0.001). Sixteen percent, 17%, and 20% of patients were at high risk in the training, test, and validation sets, respectively. GVHD-related mortality was greater in high-risk patients (18% vs. 4%, P < 0.001), as was severe gastrointestinal GVHD (17% vs. 8%, P < 0.001). The same algorithm can be successfully adapted to define 3 distinct risk groups at GVHD onset. CONCLUSION. A biomarker algorithm based on a blood sample taken 7 days after HCT can consistently identify a group of patients at high risk for lethal GVHD and NRM. FUNDING. The National Cancer Institute, American Cancer Society, and the Doris Duke Charitable Foundation.

  19. A new avenue for classification and prediction of olive cultivars using supervised and unsupervised algorithms.

    Directory of Open Access Journals (Sweden)

    Amir H Beiki

    Full Text Available Various methods have been used to identify cultivares of olive trees; herein we used different bioinformatics algorithms to propose new tools to classify 10 cultivares of olive based on RAPD and ISSR genetic markers datasets generated from PCR reactions. Five RAPD markers (OPA0a21, OPD16a, OP01a1, OPD16a1 and OPA0a8 and five ISSR markers (UBC841a4, UBC868a7, UBC841a14, U12BC807a and UBC810a13 selected as the most important markers by all attribute weighting models. K-Medoids unsupervised clustering run on SVM dataset was fully able to cluster each olive cultivar to the right classes. All trees (176 induced by decision tree models generated meaningful trees and UBC841a4 attribute clearly distinguished between foreign and domestic olive cultivars with 100% accuracy. Predictive machine learning algorithms (SVM and Naïve Bayes were also able to predict the right class of olive cultivares with 100% accuracy. For the first time, our results showed data mining techniques can be effectively used to distinguish between plant cultivares and proposed machine learning based systems in this study can predict new olive cultivars with the best possible accuracy.

  20. Development of Predictive QSAR Models of 4-Thiazolidinones Antitrypanosomal Activity using Modern Machine Learning Algorithms.

    Science.gov (United States)

    Kryshchyshyn, Anna; Devinyak, Oleg; Kaminskyy, Danylo; Grellier, Philippe; Lesyk, Roman

    2017-11-14

    This paper presents novel QSAR models for the prediction of antitrypanosomal activity among thiazolidines and related heterocycles. The performance of four machine learning algorithms: Random Forest regression, Stochastic gradient boosting, Multivariate adaptive regression splines and Gaussian processes regression have been studied in order to reach better levels of predictivity. The results for Random Forest and Gaussian processes regression are comparable and outperform other studied methods. The preliminary descriptor selection with Boruta method improved the outcome of machine learning methods. The two novel QSAR-models developed with Random Forest and Gaussian processes regression algorithms have good predictive ability, which was proved by the external evaluation of the test set with corresponding Q 2 ext =0.812 and Q 2 ext =0.830. The obtained models can be used further for in silico screening of virtual libraries in the same chemical domain in order to find new antitrypanosomal agents. Thorough analysis of descriptors influence in the QSAR models and interpretation of their chemical meaning allows to highlight a number of structure-activity relationships. The presence of phenyl rings with electron-withdrawing atoms or groups in para-position, increased number of aromatic rings, high branching but short chains, high HOMO energy, and the introduction of 1-substituted 2-indolyl fragment into the molecular structure have been recognized as trypanocidal activity prerequisites. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. A parallel adaptive mesh refinement algorithm for predicting turbulent non-premixed combusting flows

    International Nuclear Information System (INIS)

    Gao, X.; Groth, C.P.T.

    2005-01-01

    A parallel adaptive mesh refinement (AMR) algorithm is proposed for predicting turbulent non-premixed combusting flows characteristic of gas turbine engine combustors. The Favre-averaged Navier-Stokes equations governing mixture and species transport for a reactive mixture of thermally perfect gases in two dimensions, the two transport equations of the κ-ψ turbulence model, and the time-averaged species transport equations, are all solved using a fully coupled finite-volume formulation. A flexible block-based hierarchical data structure is used to maintain the connectivity of the solution blocks in the multi-block mesh and facilitate automatic solution-directed mesh adaptation according to physics-based refinement criteria. This AMR approach allows for anisotropic mesh refinement and the block-based data structure readily permits efficient and scalable implementations of the algorithm on multi-processor architectures. Numerical results for turbulent non-premixed diffusion flames, including cold- and hot-flow predictions for a bluff body burner, are described and compared to available experimental data. The numerical results demonstrate the validity and potential of the parallel AMR approach for predicting complex non-premixed turbulent combusting flows. (author)

  2. Diagnosis and prediction of periodontally compromised teeth using a deep learning-based convolutional neural network algorithm.

    Science.gov (United States)

    Lee, Jae-Hong; Kim, Do-Hyung; Jeong, Seong-Nyum; Choi, Seong-Ho

    2018-04-01

    The aim of the current study was to develop a computer-assisted detection system based on a deep convolutional neural network (CNN) algorithm and to evaluate the potential usefulness and accuracy of this system for the diagnosis and prediction of periodontally compromised teeth (PCT). Combining pretrained deep CNN architecture and a self-trained network, periapical radiographic images were used to determine the optimal CNN algorithm and weights. The diagnostic and predictive accuracy, sensitivity, specificity, positive predictive value, negative predictive value, receiver operating characteristic (ROC) curve, area under the ROC curve, confusion matrix, and 95% confidence intervals (CIs) were calculated using our deep CNN algorithm, based on a Keras framework in Python. The periapical radiographic dataset was split into training (n=1,044), validation (n=348), and test (n=348) datasets. With the deep learning algorithm, the diagnostic accuracy for PCT was 81.0% for premolars and 76.7% for molars. Using 64 premolars and 64 molars that were clinically diagnosed as severe PCT, the accuracy of predicting extraction was 82.8% (95% CI, 70.1%-91.2%) for premolars and 73.4% (95% CI, 59.9%-84.0%) for molars. We demonstrated that the deep CNN algorithm was useful for assessing the diagnosis and predictability of PCT. Therefore, with further optimization of the PCT dataset and improvements in the algorithm, a computer-aided detection system can be expected to become an effective and efficient method of diagnosing and predicting PCT.

  3. Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

    Directory of Open Access Journals (Sweden)

    Wu Chi-Yeh

    2010-01-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as support vector machine (SVM are extensively adopted in these ab initio approaches due to the prediction performance they achieved. On the other hand, logic based classifiers such as decision tree, of which the constructed model is interpretable, have attracted less attention. Results This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G2DE based classifier. The G2DE is a kernel based algorithm designed to provide interpretability by utilizing a few but representative kernels for constructing the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor is capable of achieving prediction performance comparable to those delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set. Conclusion Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G2DE employed in this study can deliver prediction accuracy comparable with the state-of-the-art kernel based machine learning algorithms. Furthermore, biologists can obtain valuable insights about the different characteristics of the sequences of pre-miRNAs with the models generated by the G

  4. Improved feature selection based on genetic algorithms for real time disruption prediction on JET

    Energy Technology Data Exchange (ETDEWEB)

    Ratta, G.A., E-mail: garatta@gateme.unsj.edu.ar [GATEME, Facultad de Ingenieria, Universidad Nacional de San Juan, Avda. San Martin 1109 (O), 5400 San Juan (Argentina); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom); Vega, J. [Asociacion EURATOM/CIEMAT para Fusion, Avda. Complutense, 40, 28040 Madrid (Spain); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom); Murari, A. [Associazione EURATOM-ENEA per la Fusione, Consorzio RFX, 4-35127 Padova (Italy); JET EFDA, Culham Science Centre, OX14 3DB Abingdon (United Kingdom)

    2012-09-15

    Highlights: Black-Right-Pointing-Pointer A new signal selection methodology to improve disruption prediction is reported. Black-Right-Pointing-Pointer The approach is based on Genetic Algorithms. Black-Right-Pointing-Pointer An advanced predictor has been created with the new set of signals. Black-Right-Pointing-Pointer The new system obtains considerably higher prediction rates. - Abstract: The early prediction of disruptions is an important aspect of the research in the field of Tokamak control. A very recent predictor, called 'Advanced Predictor Of Disruptions' (APODIS), developed for the 'Joint European Torus' (JET), implements the real time recognition of incoming disruptions with the best success rate achieved ever and an outstanding stability for long periods following training. In this article, a new methodology to select the set of the signals' parameters in order to maximize the performance of the predictor is reported. The approach is based on 'Genetic Algorithms' (GAs). With the feature selection derived from GAs, a new version of APODIS has been developed. The results are significantly better than the previous version not only in terms of success rates but also in extending the interval before the disruption in which reliable predictions are achieved. Correct disruption predictions with a success rate in excess of 90% have been achieved 200 ms before the time of the disruption. The predictor response is compared with that of JET's Protection System (JPS) and the ADODIS predictor is shown to be far superior. Both systems have been carefully tested with a wide number of discharges to understand their relative merits and the most profitable directions of further improvements.

  5. Small angle X-ray scattering and cross-linking for data assisted protein structure prediction in CASP 12 with prospects for improved accuracy

    KAUST Repository

    Ogorzalek, Tadeusz L.

    2018-01-04

    Experimental data offers empowering constraints for structure prediction. These constraints can be used to filter equivalently scored models or more powerfully within optimization functions toward prediction. In CASP12, Small Angle X-ray Scattering (SAXS) and Cross-Linking Mass Spectrometry (CLMS) data, measured on an exemplary set of novel fold targets, were provided to the CASP community of protein structure predictors. As HT, solution-based techniques, SAXS and CLMS can efficiently measure states of the full-length sequence in its native solution conformation and assembly. However, this experimental data did not substantially improve prediction accuracy judged by fits to crystallographic models. One issue, beyond intrinsic limitations of the algorithms, was a disconnect between crystal structures and solution-based measurements. Our analyses show that many targets had substantial percentages of disordered regions (up to 40%) or were multimeric or both. Thus, solution measurements of flexibility and assembly support variations that may confound prediction algorithms trained on crystallographic data and expecting globular fully-folded monomeric proteins. Here, we consider the CLMS and SAXS data collected, the information in these solution measurements, and the challenges in incorporating them into computational prediction. As improvement opportunities were only partly realized in CASP12, we provide guidance on how data from the full-length biological unit and the solution state can better aid prediction of the folded monomer or subunit. We furthermore describe strategic integrations of solution measurements with computational prediction programs with the aim of substantially improving foundational knowledge and the accuracy of computational algorithms for biologically-relevant structure predictions for proteins in solution. This article is protected by copyright. All rights reserved.

  6. Small angle X-ray scattering and cross-linking for data assisted protein structure prediction in CASP 12 with prospects for improved accuracy

    KAUST Repository

    Ogorzalek, Tadeusz L.; Hura, Greg L.; Belsom, Adam; Burnett, Kathryn H.; Kryshtafovych, Andriy; Tainer, John A.; Rappsilber, Juri; Tsutakawa, Susan E.; Fidelis, Krzysztof

    2018-01-01

    Experimental data offers empowering constraints for structure prediction. These constraints can be used to filter equivalently scored models or more powerfully within optimization functions toward prediction. In CASP12, Small Angle X-ray Scattering (SAXS) and Cross-Linking Mass Spectrometry (CLMS) data, measured on an exemplary set of novel fold targets, were provided to the CASP community of protein structure predictors. As HT, solution-based techniques, SAXS and CLMS can efficiently measure states of the full-length sequence in its native solution conformation and assembly. However, this experimental data did not substantially improve prediction accuracy judged by fits to crystallographic models. One issue, beyond intrinsic limitations of the algorithms, was a disconnect between crystal structures and solution-based measurements. Our analyses show that many targets had substantial percentages of disordered regions (up to 40%) or were multimeric or both. Thus, solution measurements of flexibility and assembly support variations that may confound prediction algorithms trained on crystallographic data and expecting globular fully-folded monomeric proteins. Here, we consider the CLMS and SAXS data collected, the information in these solution measurements, and the challenges in incorporating them into computational prediction. As improvement opportunities were only partly realized in CASP12, we provide guidance on how data from the full-length biological unit and the solution state can better aid prediction of the folded monomer or subunit. We furthermore describe strategic integrations of solution measurements with computational prediction programs with the aim of substantially improving foundational knowledge and the accuracy of computational algorithms for biologically-relevant structure predictions for proteins in solution. This article is protected by copyright. All rights reserved.

  7. Predicting How Close Near-Earth Asteroids Will Come to Earth in the Next Five Years Using Only Kepler's Algorithm

    National Research Council Canada - National Science Library

    Wright, Melissa

    1998-01-01

    .... The goal of th is investigation was to see if using only Kepler's algorithm, which ignores the gravitational pull of other planets, our moon, and Jupiter, was sufficient to predict close encounters with Earth...

  8. Secondary Structure Prediction of Protein using Resilient Back Propagation Learning Algorithm

    Directory of Open Access Journals (Sweden)

    Jyotshna Dongardive

    2015-12-01

    Full Text Available The paper proposes a neural network based approach to predict secondary structure of protein. It uses Multilayer Feed Forward Network (MLFN with resilient back propagation as the learning algorithm. Point Accepted Mutation (PAM is adopted as the encoding scheme and CB396 data set is used for the training and testing of the network. Overall accuracy of the network has been experimentally calculated with different window sizes for the sliding window scheme and by varying the number of units in the hidden layer. The best results were obtained with eleven as the window size and seven as the number of units in the hidden layer.

  9. Predicting Sepsis Risk Using the "Sniffer" Algorithm in the Electronic Medical Record.

    Science.gov (United States)

    Olenick, Evelyn M; Zimbro, Kathie S; DʼLima, Gabrielle M; Ver Schneider, Patricia; Jones, Danielle

    The Sepsis "Sniffer" Algorithm (SSA) has merit as a digital sepsis alert but should be considered an adjunct to versus an alternative for the Nurse Screening Tool (NST), given lower specificity and positive predictive value. The SSA reduced the risk of incorrectly categorizing patients at low risk for sepsis, detected sepsis high risk in half the time, and reduced redundant NST screens by 70% and manual screening hours by 64% to 72%. Preserving nurse hours expended on manual sepsis alerts may translate into time directed toward other patient priorities.

  10. QIKAIM, a fast seminumerical algorithm for the generation of minute-of-arc accuracy satellite predictions

    Science.gov (United States)

    Vermeer, M.

    1981-07-01

    A program was designed to replace AIMLASER for the generation of aiming predictions, to achieve a major saving in computing time, and to keep the program small enough for use even on small systems. An approach was adopted that incorporated the numerical integration of the orbit through a pass, limiting the computation of osculating elements to only one point per pass. The numerical integration method which is fourth order in delta t in the cumulative error after a given time lapse is presented. Algorithms are explained and a flowchart and listing of the program are provided.

  11. Forecasting pulsatory motion for non-invasive cardiac radiosurgery: an analysis of algorithms from respiratory motion prediction.

    Science.gov (United States)

    Ernst, Floris; Bruder, Ralf; Schlaefer, Alexander; Schweikard, Achim

    2011-01-01

    Recently, radiosurgical treatment of cardiac arrhythmia, especially atrial fibrillation, has been proposed. Using the CyberKnife, focussed radiation will be used to create ablation lines on the beating heart to block unwanted electrical activity. Since this procedure requires high accuracy, the inevitable latency of the system (i.e., the robotic manipulator following the motion of the heart) has to be compensated for. We examine the applicability of prediction algorithms developed for respiratory motion prediction to the prediction of pulsatory motion. We evaluated the MULIN, nLMS, wLMS, SVRpred and EKF algorithms. The test data used has been recorded using external infrared position sensors, 3D ultrasound and the NavX catheter systems. With this data, we have shown that the error from latency can be reduced by at least 10 and as much as 75% (44% average), depending on the type of signal. It has also been shown that, although the SVRpred algorithm was successful in most cases, it was outperformed by the simple nLMS algorithm, the EKF or the wLMS algorithm in a number of cases. We have shown that prediction of cardiac motion is possible and that the algorithms known from respiratory motion prediction are applicable. Since pulsation is more regular than respiration, more research will have to be done to improve frequency-tracking algorithms, like the EKF method, which performed better than expected from their behaviour on respiratory motion traces.

  12. PREDICTION OF WATER QUALITY INDEX USING BACK PROPAGATION NETWORK ALGORITHM. CASE STUDY: GOMBAK RIVER

    Directory of Open Access Journals (Sweden)

    FARIS GORASHI

    2012-08-01

    Full Text Available The aim of this study is to enable prediction of water quality parameters with conjunction to land use attributes and to find a low-end alternative for water quality monitoring techniques, which are typically expensive and tedious. It also aims to ensure sustainable development, which is essentially has effects on water quality. The research approach followed in this study is via using artificial neural networks, and geographical information system to provide a reliable prediction model. Back propagation network algorithm was used for the purpose of this study. The proposed approach minimized most of anomalies associated with prediction methods and provided water quality prediction with precision. The study used 5 hidden nodes in this network. The network was optimized to complete 23145 cycles before it reaches the best error of 0.65. Stations 18 had shown the greatest fluctuation among the three stations as it reflects an area of on-going rapid development of Gombak river watershed. The results had shown a very close prediction with best error of 0.67 in a sensitivity test that was carried afterwards.

  13. A combination of compositional index and genetic algorithm for predicting transmembrane helical segments.

    Directory of Open Access Journals (Sweden)

    Nazar Zaki

    Full Text Available Transmembrane helix (TMH topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset was 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of the accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method.The datasets, software together with supplementary materials are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm.

  14. Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations.

    Science.gov (United States)

    Zong, Nansu; Kim, Hyeoneui; Ngo, Victoria; Harismendy, Olivier

    2017-08-01

    A heterogeneous network topology possessing abundant interactions between biomedical entities has yet to be utilized in similarity-based methods for predicting drug-target associations based on the array of varying features of drugs and their targets. Deep learning reveals features of vertices of a large network that can be adapted in accommodating the similarity-based solutions to provide a flexible method of drug-target prediction. We propose a similarity-based drug-target prediction method that enhances existing association discovery methods by using a topology-based similarity measure. DeepWalk, a deep learning method, is adopted in this study to calculate the similarities within Linked Tripartite Network (LTN), a heterogeneous network generated from biomedical linked datasets. This proposed method shows promising results for drug-target association prediction: 98.96% AUC ROC score with a 10-fold cross-validation and 99.25% AUC ROC score with a Monte Carlo cross-validation with LTN. By utilizing DeepWalk, we demonstrate that: (i) this method outperforms other existing topology-based similarity computation methods, (ii) the performance is better for tripartite than with bipartite networks and (iii) the measure of similarity using network topology outperforms the ones derived from chemical structure (drugs) or genomic sequence (targets). Our proposed methodology proves to be capable of providing a promising solution for drug-target prediction based on topological similarity with a heterogeneous network, and may be readily re-purposed and adapted in the existing of similarity-based methodologies. The proposed method has been developed in JAVA and it is available, along with the data at the following URL: https://github.com/zongnansu1982/drug-target-prediction . nazong@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. A Rule-Based Model for Bankruptcy Prediction Based on an Improved Genetic Ant Colony Algorithm

    Directory of Open Access Journals (Sweden)

    Yudong Zhang

    2013-01-01

    Full Text Available In this paper, we proposed a hybrid system to predict corporate bankruptcy. The whole procedure consists of the following four stages: first, sequential forward selection was used to extract the most important features; second, a rule-based model was chosen to fit the given dataset since it can present physical meaning; third, a genetic ant colony algorithm (GACA was introduced; the fitness scaling strategy and the chaotic operator were incorporated with GACA, forming a new algorithm—fitness-scaling chaotic GACA (FSCGACA, which was used to seek the optimal parameters of the rule-based model; and finally, the stratified K-fold cross-validation technique was used to enhance the generalization of the model. Simulation experiments of 1000 corporations’ data collected from 2006 to 2009 demonstrated that the proposed model was effective. It selected the 5 most important factors as “net income to stock broker’s equality,” “quick ratio,” “retained earnings to total assets,” “stockholders’ equity to total assets,” and “financial expenses to sales.” The total misclassification error of the proposed FSCGACA was only 7.9%, exceeding the results of genetic algorithm (GA, ant colony algorithm (ACA, and GACA. The average computation time of the model is 2.02 s.

  16. Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm.

    Science.gov (United States)

    Bai, Li-Yue; Dai, Hao; Xu, Qin; Junaid, Muhammad; Peng, Shao-Liang; Zhu, Xiaolei; Xiong, Yi; Wei, Dong-Qing

    2018-02-05

    Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.

  17. Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm

    Directory of Open Access Journals (Sweden)

    Li-Yue Bai

    2018-02-01

    Full Text Available Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.

  18. Dynamic Heat Supply Prediction Using Support Vector Regression Optimized by Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Meiping Wang

    2016-01-01

    Full Text Available We developed an effective intelligent model to predict the dynamic heat supply of heat source. A hybrid forecasting method was proposed based on support vector regression (SVR model-optimized particle swarm optimization (PSO algorithms. Due to the interaction of meteorological conditions and the heating parameters of heating system, it is extremely difficult to forecast dynamic heat supply. Firstly, the correlations among heat supply and related influencing factors in the heating system were analyzed through the correlation analysis of statistical theory. Then, the SVR model was employed to forecast dynamic heat supply. In the model, the input variables were selected based on the correlation analysis and three crucial parameters, including the penalties factor, gamma of the kernel RBF, and insensitive loss function, were optimized by PSO algorithms. The optimized SVR model was compared with the basic SVR, optimized genetic algorithm-SVR (GA-SVR, and artificial neural network (ANN through six groups of experiment data from two heat sources. The results of the correlation coefficient analysis revealed the relationship between the influencing factors and the forecasted heat supply and determined the input variables. The performance of the PSO-SVR model is superior to those of the other three models. The PSO-SVR method is statistically robust and can be applied to practical heating system.

  19. Wind Power Grid Connected Capacity Prediction Using LSSVM Optimized by the Bat Algorithm

    Directory of Open Access Journals (Sweden)

    Qunli Wu

    2015-12-01

    Full Text Available Given the stochastic nature of wind, wind power grid-connected capacity prediction plays an essential role in coping with the challenge of balancing supply and demand. Accurate forecasting methods make enormous contribution to mapping wind power strategy, power dispatching and sustainable development of wind power industry. This study proposes a bat algorithm (BA–least squares support vector machine (LSSVM hybrid model to improve prediction performance. In order to select input of LSSVM effectively, Stationarity, Cointegration and Granger causality tests are conducted to examine the influence of installed capacity with different lags, and partial autocorrelation analysis is employed to investigate the inner relationship of grid-connected capacity. The parameters in LSSVM are optimized by BA to validate the learning ability and generalization of LSSVM. Multiple model sufficiency evaluation methods are utilized. The research results reveal that the accuracy improvement of the present approach can reach about 20% compared to other single or hybrid models.

  20. Application of genetic algorithm - multiple linear regressions to predict the activity of RSK inhibitors

    Directory of Open Access Journals (Sweden)

    Avval Zhila Mohajeri

    2015-01-01

    Full Text Available This paper deals with developing a linear quantitative structure-activity relationship (QSAR model for predicting the RSK inhibition activity of some new compounds. A dataset consisting of 62 pyrazino [1,2-α] indole, diazepino [1,2-α] indole, and imidazole derivatives with known inhibitory activities was used. Multiple linear regressions (MLR technique combined with the stepwise (SW and the genetic algorithm (GA methods as variable selection tools was employed. For more checking stability, robustness and predictability of the proposed models, internal and external validation techniques were used. Comparison of the results obtained, indicate that the GA-MLR model is superior to the SW-MLR model and that it isapplicable for designing novel RSK inhibitors.

  1. Earthquake prediction in California using regression algorithms and cloud-based big data infrastructure

    Science.gov (United States)

    Asencio-Cortés, G.; Morales-Esteban, A.; Shang, X.; Martínez-Álvarez, F.

    2018-06-01

    Earthquake magnitude prediction is a challenging problem that has been widely studied during the last decades. Statistical, geophysical and machine learning approaches can be found in literature, with no particularly satisfactory results. In recent years, powerful computational techniques to analyze big data have emerged, making possible the analysis of massive datasets. These new methods make use of physical resources like cloud based architectures. California is known for being one of the regions with highest seismic activity in the world and many data are available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (1 GB catalog is used), in order to predict earthquakes magnitude within the next seven days. Apache Spark framework, H2 O library in R language and Amazon cloud infrastructure were been used, reporting very promising results.

  2. Earthquake Prediction Analysis Based on Empirical Seismic Rate: The M8 Algorithm

    International Nuclear Information System (INIS)

    Molchan, G.; Romashkova, L.

    2010-07-01

    The quality of space-time earthquake prediction is usually characterized by a two-dimensional error diagram (n,τ), where n is the rate of failures-to-predict and τ is the normalized measure of space-time alarm. The most reasonable space measure for analysis of a prediction strategy is the rate of target events λ(dg) in a sub-area dg. In that case the quantity H = 1-(n +τ) determines the prediction capability of the strategy. The uncertainty of λ(dg) causes difficulties in estimating H and the statistical significance, α, of prediction results. We investigate this problem theoretically and show how the uncertainty of the measure can be taken into account in two situations, viz., the estimation of α and the construction of a confidence zone for the (n,τ)-parameters of the random strategies. We use our approach to analyse the results from prediction of M ≥ 8.0 events by the M8 method for the period 1985-2009 (the M8.0+ test). The model of λ(dg) based on the events Mw ≥ 5.5, 1977-2004, and the magnitude range of target events 8.0 ≤ M < 8.5 are considered as basic to this M8 analysis. We find the point and upper estimates of α and show that they are still unstable because the number of target events in the experiment is small. However, our results argue in favour of non-triviality of the M8 prediction algorithm. (author)

  3. Earthquake prediction analysis based on empirical seismic rate: the M8 algorithm

    Science.gov (United States)

    Molchan, G.; Romashkova, L.

    2010-12-01

    The quality of space-time earthquake prediction is usually characterized by a 2-D error diagram (n, τ), where n is the fraction of failures-to-predict and τ is the local rate of alarm averaged in space. The most reasonable averaging measure for analysis of a prediction strategy is the normalized rate of target events λ(dg) in a subarea dg. In that case the quantity H = 1 - (n + τ) determines the prediction capability of the strategy. The uncertainty of λ(dg) causes difficulties in estimating H and the statistical significance, α, of prediction results. We investigate this problem theoretically and show how the uncertainty of the measure can be taken into account in two situations, viz., the estimation of α and the construction of a confidence zone for the (n, τ)-parameters of the random strategies. We use our approach to analyse the results from prediction of M >= 8.0 events by the M8 method for the period 1985-2009 (the M8.0+ test). The model of λ(dg) based on the events Mw >= 5.5, 1977-2004, and the magnitude range of target events 8.0 <= M < 8.5 are considered as basic to this M8 analysis. We find the point and upper estimates of α and show that they are still unstable because the number of target events in the experiment is small. However, our results argue in favour of non-triviality of the M8 prediction algorithm.

  4. Multi-objective evolutionary algorithms for fuzzy classification in survival prediction.

    Science.gov (United States)

    Jiménez, Fernando; Sánchez, Gracia; Juárez, José M

    2014-03-01

    This paper presents a novel rule-based fuzzy classification methodology for survival/mortality prediction in severe burnt patients. Due to the ethical aspects involved in this medical scenario, physicians tend not to accept a computer-based evaluation unless they understand why and how such a recommendation is given. Therefore, any fuzzy classifier model must be both accurate and interpretable. The proposed methodology is a three-step process: (1) multi-objective constrained optimization of a patient's data set, using Pareto-based elitist multi-objective evolutionary algorithms to maximize accuracy and minimize the complexity (number of rules) of classifiers, subject to interpretability constraints; this step produces a set of alternative (Pareto) classifiers; (2) linguistic labeling, which assigns a linguistic label to each fuzzy set of the classifiers; this step is essential to the interpretability of the classifiers; (3) decision making, whereby a classifier is chosen, if it is satisfactory, according to the preferences of the decision maker. If no classifier is satisfactory for the decision maker, the process starts again in step (1) with a different input parameter set. The performance of three multi-objective evolutionary algorithms, niched pre-selection multi-objective algorithm, elitist Pareto-based multi-objective evolutionary algorithm for diversity reinforcement (ENORA) and the non-dominated sorting genetic algorithm (NSGA-II), was tested using a patient's data set from an intensive care burn unit and a standard machine learning data set from an standard machine learning repository. The results are compared using the hypervolume multi-objective metric. Besides, the results have been compared with other non-evolutionary techniques and validated with a multi-objective cross-validation technique. Our proposal improves the classification rate obtained by other non-evolutionary techniques (decision trees, artificial neural networks, Naive Bayes, and case

  5. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    Directory of Open Access Journals (Sweden)

    Daqing Zhang

    2015-01-01

    Full Text Available Blood-brain barrier (BBB is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented genetic algorithm (GA to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA/hydrogen-bonding ability, lipophilicity, and molecular charge play important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration.

  6. Design of a fuzzy differential evolution algorithm to predict non-deposition sediment transport

    Science.gov (United States)

    Ebtehaj, Isa; Bonakdari, Hossein

    2017-12-01

    Since the flow entering a sewer contains solid matter, deposition at the bottom of the channel is inevitable. It is difficult to understand the complex, three-dimensional mechanism of sediment transport in sewer pipelines. Therefore, a method to estimate the limiting velocity is necessary for optimal designs. Due to the inability of gradient-based algorithms to train Adaptive Neuro-Fuzzy Inference Systems (ANFIS) for non-deposition sediment transport prediction, a new hybrid ANFIS method based on a differential evolutionary algorithm (ANFIS-DE) is developed. The training and testing performance of ANFIS-DE is evaluated using a wide range of dimensionless parameters gathered from the literature. The input combination used to estimate the densimetric Froude number ( Fr) parameters includes the volumetric sediment concentration ( C V ), ratio of median particle diameter to hydraulic radius ( d/R), ratio of median particle diameter to pipe diameter ( d/D) and overall friction factor of sediment ( λ s ). The testing results are compared with the ANFIS model and regression-based equation results. The ANFIS-DE technique predicted sediment transport at limit of deposition with lower root mean square error (RMSE = 0.323) and mean absolute percentage of error (MAPE = 0.065) and higher accuracy ( R 2 = 0.965) than the ANFIS model and regression-based equations.

  7. A Model Predictive Algorithm for Active Control of Nonlinear Noise Processes

    Directory of Open Access Journals (Sweden)

    Qi-Zhi Zhang

    2005-01-01

    Full Text Available In this paper, an improved nonlinear Active Noise Control (ANC system is achieved by introducing an appropriate secondary source. For ANC system to be successfully implemented, the nonlinearity of the primary path and time delay of the secondary path must be overcome. A nonlinear Model Predictive Control (MPC strategy is introduced to deal with the time delay in the secondary path and the nonlinearity in the primary path of the ANC system. An overall online modeling technique is utilized for online secondary path and primary path estimation. The secondary path is estimated using an adaptive FIR filter, and the primary path is estimated using a Neural Network (NN. The two models are connected in parallel with the two paths. In this system, the mutual disturbances between the operation of the nonlinear ANC controller and modeling of the secondary can be greatly reduced. The coefficients of the adaptive FIR filter and weight vector of NN are adjusted online. Computer simulations are carried out to compare the proposed nonlinear MPC method with the nonlinear Filter-x Least Mean Square (FXLMS algorithm. The results showed that the convergence speed of the proposed nonlinear MPC algorithm is faster than that of nonlinear FXLMS algorithm. For testing the robust performance of the proposed nonlinear ANC system, the sudden changes in the secondary path and primary path of the ANC system are considered. Results indicated that the proposed nonlinear ANC system can rapidly track the sudden changes in the acoustic paths of the nonlinear ANC system, and ensure the adaptive algorithm stable when the nonlinear ANC system is time variable.

  8. Accuracy test for link prediction in terms of similarity index: The case of WS and BA models

    Science.gov (United States)

    Ahn, Min-Woo; Jung, Woo-Sung

    2015-07-01

    Link prediction is a technique that uses the topological information in a given network to infer the missing links in it. Since past research on link prediction has primarily focused on enhancing performance for given empirical systems, negligible attention has been devoted to link prediction with regard to network models. In this paper, we thus apply link prediction to two network models: The Watts-Strogatz (WS) model and Barabási-Albert (BA) model. We attempt to gain a better understanding of the relation between accuracy and each network parameter (mean degree, the number of nodes and the rewiring probability in the WS model) through network models. Six similarity indices are used, with precision and area under the ROC curve (AUC) value as the accuracy metrics. We observe a positive correlation between mean degree and accuracy, and size independence of the AUC value.

  9. Influenza detection and prediction algorithms: comparative accuracy trial in Östergötland county, Sweden, 2008-2012.

    Science.gov (United States)

    Spreco, A; Eriksson, O; Dahlström, Ö; Timpka, T

    2017-07-01

    Methods for the detection of influenza epidemics and prediction of their progress have seldom been comparatively evaluated using prospective designs. This study aimed to perform a prospective comparative trial of algorithms for the detection and prediction of increased local influenza activity. Data on clinical influenza diagnoses recorded by physicians and syndromic data from a telenursing service were used. Five detection and three prediction algorithms previously evaluated in public health settings were calibrated and then evaluated over 3 years. When applied on diagnostic data, only detection using the Serfling regression method and prediction using the non-adaptive log-linear regression method showed acceptable performances during winter influenza seasons. For the syndromic data, none of the detection algorithms displayed a satisfactory performance, while non-adaptive log-linear regression was the best performing prediction method. We conclude that evidence was found for that available algorithms for influenza detection and prediction display satisfactory performance when applied on local diagnostic data during winter influenza seasons. When applied on local syndromic data, the evaluated algorithms did not display consistent performance. Further evaluations and research on combination of methods of these types in public health information infrastructures for 'nowcasting' (integrated detection and prediction) of influenza activity are warranted.

  10. MSD-MAP: A Network-Based Systems Biology Platform for Predicting Disease-Metabolite Links.

    Science.gov (United States)

    Wathieu, Henri; Issa, Naiem T; Mohandoss, Manisha; Byers, Stephen W; Dakshanamurthy, Sivanesan

    2017-01-01

    Cancer-associated metabolites result from cell-wide mechanisms of dysregulation. The field of metabolomics has sought to identify these aberrant metabolites as disease biomarkers, clues to understanding disease mechanisms, or even as therapeutic agents. This study was undertaken to reliably predict metabolites associated with colorectal, esophageal, and prostate cancers. Metabolite and disease biological action networks were compared in a computational platform called MSD-MAP (Multi Scale Disease-Metabolite Association Platform). Using differential gene expression analysis with patient-based RNAseq data from The Cancer Genome Atlas, genes up- or down-regulated in cancer compared to normal tissue were identified. Relational databases were used to map biological entities including pathways, functions, and interacting proteins, to those differential disease genes. Similar relational maps were built for metabolites, stemming from known and in silico predicted metabolite-protein associations. The hypergeometric test was used to find statistically significant relationships between disease and metabolite biological signatures at each tier, and metabolites were assessed for multi-scale association with each cancer. Metabolite networks were also directly associated with various other diseases using a disease functional perturbation database. Our platform recapitulated metabolite-disease links that have been empirically verified in the scientific literature, with network-based mapping of jointly-associated biological activity also matching known disease mechanisms. This was true for colorectal, esophageal, and prostate cancers, using metabolite action networks stemming from both predicted and known functional protein associations. By employing systems biology concepts, MSD-MAP reliably predicted known cancermetabolite links, and may serve as a predictive tool to streamline conventional metabolomic profiling methodologies. Copyright© Bentham Science Publishers; For any

  11. PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast.

    Science.gov (United States)

    Lai, Fu-Jou; Chang, Hong-Tsun; Wu, Wei-Sheng

    2015-01-01

    Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of lacking sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put a lot of effort to construct it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. Allowing users to select eight existing performance indices and 15

  12. Predicting peptides binding to MHC class II molecules using multi-objective evolutionary algorithms

    Directory of Open Access Journals (Sweden)

    Feng Lin

    2007-11-01

    Full Text Available Abstract Background Peptides binding to Major Histocompatibility Complex (MHC class II molecules are crucial for initiation and regulation of immune responses. Predicting peptides that bind to a specific MHC molecule plays an important role in determining potential candidates for vaccines. The binding groove in class II MHC is open at both ends, allowing peptides longer than 9-mer to bind. Finding the consensus motif facilitating the binding of peptides to a MHC class II molecule is difficult because of different lengths of binding peptides and varying location of 9-mer binding core. The level of difficulty increases when the molecule is promiscuous and binds to a large number of low affinity peptides. In this paper, we propose two approaches using multi-objective evolutionary algorithms (MOEA for predicting peptides binding to MHC class II molecules. One uses the information from both binders and non-binders for self-discovery of motifs. The other, in addition, uses information from experimentally determined motifs for guided-discovery of motifs. Results The proposed methods are intended for finding peptides binding to MHC class II I-Ag7 molecule – a promiscuous binder to a large number of low affinity peptides. Cross-validation results across experiments on two motifs derived for I-Ag7 datasets demonstrate better generalization abilities and accuracies of the present method over earlier approaches. Further, the proposed method was validated and compared on two publicly available benchmark datasets: (1 an ensemble of qualitative HLA-DRB1*0401 peptide data obtained from five different sources, and (2 quantitative peptide data obtained for sixteen different alleles comprising of three mouse alleles and thirteen HLA alleles. The proposed method outperformed earlier methods on most datasets, indicating that it is well suited for finding peptides binding to MHC class II molecules. Conclusion We present two MOEA-based algorithms for finding motifs

  13. A Modified Spatiotemporal Fusion Algorithm Using Phenological Information for Predicting Reflectance of Paddy Rice in Southern China

    Directory of Open Access Journals (Sweden)

    Mengxue Liu

    2018-05-01

    Full Text Available Satellite data for studying surface dynamics in heterogeneous landscapes are missing due to frequent cloud contamination, low temporal resolution, and technological difficulties in developing satellites. A modified spatiotemporal fusion algorithm for predicting the reflectance of paddy rice is presented in this paper. The algorithm uses phenological information extracted from a moderate-resolution imaging spectroradiometer enhanced vegetation index time series to improve the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM. The algorithm is tested with satellite data on Yueyang City, China. The main contribution of the modified algorithm is the selection of similar neighborhood pixels by using phenological information to improve accuracy. Results show that the modified algorithm performs better than ESTARFM in visual inspection and quantitative metrics, especially for paddy rice. This modified algorithm provides not only new ideas for the improvement of spatiotemporal data fusion method, but also technical support for the generation of remote sensing data with high spatial and temporal resolution.

  14. Predicting Solar Flares Using SDO /HMI Vector Magnetic Data Products and the Random Forest Algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Chang; Deng, Na; Wang, Haimin [Space Weather Research Laboratory, New Jersey Institute of Technology, University Heights, Newark, NJ 07102-1982 (United States); Wang, Jason T. L., E-mail: chang.liu@njit.edu, E-mail: na.deng@njit.edu, E-mail: haimin.wang@njit.edu, E-mail: jason.t.wang@njit.edu [Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102-1982 (United States)

    2017-07-10

    Adverse space-weather effects can often be traced to solar flares, the prediction of which has drawn significant research interests. The Helioseismic and Magnetic Imager (HMI) produces full-disk vector magnetograms with continuous high cadence, while flare prediction efforts utilizing this unprecedented data source are still limited. Here we report results of flare prediction using physical parameters provided by the Space-weather HMI Active Region Patches (SHARP) and related data products. We survey X-ray flares that occurred from 2010 May to 2016 December and categorize their source regions into four classes (B, C, M, and X) according to the maximum GOES magnitude of flares they generated. We then retrieve SHARP-related parameters for each selected region at the beginning of its flare date to build a database. Finally, we train a machine-learning algorithm, called random forest (RF), to predict the occurrence of a certain class of flares in a given active region within 24 hr, evaluate the classifier performance using the 10-fold cross-validation scheme, and characterize the results using standard performance metrics. Compared to previous works, our experiments indicate that using the HMI parameters and RF is a valid method for flare forecasting with fairly reasonable prediction performance. To our knowledge, this is the first time that RF has been used to make multiclass predictions of solar flares. We also find that the total unsigned quantities of vertical current, current helicity, and flux near the polarity inversion line are among the most important parameters for classifying flaring regions into different classes.

  15. A New Class of Flare Prediction Algorithms: A Synthesis of Data, Pattern Recognition Algorithms, and First Principles Magnetohydrodynamics, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Researchers have been working on flare prediction for many decades. However, the best prediction result achieved by Falconer et al. for major flares, CMEs, and solar...

  16. Hazard Forecasting by MRI: A Prediction Algorithm of the First Kind

    Science.gov (United States)

    Lomnitz, C.

    2003-12-01

    Seismic gaps do not tell us when and where the next earthquake is due. We present new results on limited earthquake hazard prediction at plate boundaries. Our algorithm quantifies earthquake hazard in seismic gaps. The prediction window found for M7 is on the order of 50 km by 20 years (Lomnitz, 1996a). The earth is unstable with respect to small perturbations of the initial conditions. A prediction of the first kind is an estimate of the time evolution of a complex system with fixed boundary conditions in response to changes in the initial state, for example, weather prediction (Edward Lorenz, 1975; Hasselmann, 2002). We use the catalog of large world earthquakes as a proxy for the initial conditions. The MRI algorithm simulates the response of the system to updating the catalog. After a local stress transient dP the entropy decays as (grad dP)2 due to transient flows directed toward the epicenter. Healing is the thermodynamic process which resets the state of stress. It proceeds as a power law from the rupture boundary inwards, as in a wound. The half-life of a rupture is defined as the healing time which shrinks the size of a scar by half. Healed segments of plate boundary can rupture again. From observations in Chile, Mexico and Japan we find that the half-life of a seismic rupture is about 20 years, in agreement with seismic gap observations. The moment ratio MR is defined as the contrast between the cumulative regional moment release and the local moment deficiency at time t along the plate boundary. The procedure is called MRI. The findings: (1) MRI works; (2) major earthquakes match prominent peaks in the MRI graph; (3) important events (Central Chile 1985; Mexico 1985; Kobe 1995) match MRI peaks which began to emerge 10 to 20 years before the earthquake; (4) The emergence of peaks in MRI depends on earlier ruptures that occurred, not adjacent to but at 10 to 20 fault lengths from the epicentral region, in agreement with triggering effects. The hazard

  17. XTALOPT: An open-source evolutionary algorithm for crystal structure prediction

    Science.gov (United States)

    Lonie, David C.; Zurek, Eva

    2011-02-01

    The implementation and testing of XTALOPT, an evolutionary algorithm for crystal structure prediction, is outlined. We present our new periodic displacement (ripple) operator which is ideally suited to extended systems. It is demonstrated that hybrid operators, which combine two pure operators, reduce the number of duplicate structures in the search. This allows for better exploration of the potential energy surface of the system in question, while simultaneously zooming in on the most promising regions. A continuous workflow, which makes better use of computational resources as compared to traditional generation based algorithms, is employed. Various parameters in XTALOPT are optimized using a novel benchmarking scheme. XTALOPT is available under the GNU Public License, has been interfaced with various codes commonly used to study extended systems, and has an easy to use, intuitive graphical interface. Program summaryProgram title:XTALOPT Catalogue identifier: AEGX_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEGX_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPL v2.1 or later [1] No. of lines in distributed program, including test data, etc.: 36 849 No. of bytes in distributed program, including test data, etc.: 1 149 399 Distribution format: tar.gz Programming language: C++ Computer: PCs, workstations, or clusters Operating system: Linux Classification: 7.7 External routines: QT [2], OpenBabel [3], AVOGADRO [4], SPGLIB [8] and one of: VASP [5], PWSCF [6], GULP [7]. Nature of problem: Predicting the crystal structure of a system from its stoichiometry alone remains a grand challenge in computational materials science, chemistry, and physics. Solution method: Evolutionary algorithms are stochastic search techniques which use concepts from biological evolution in order to locate the global minimum on their potential energy surface. Our evolutionary algorithm, XTALOPT, is freely

  18. Enhancing Accuracy of Sediment Total Load Prediction Using Evolutionary Algorithms (Case Study: Gotoorchay River

    Directory of Open Access Journals (Sweden)

    K. Roshangar

    2016-09-01

    Full Text Available Introduction: Exact prediction of transported sediment rate by rivers in water resources projects is of utmost importance. Basically erosion and sediment transport process is one of the most complexes hydrodynamic. Although different studies have been developed on the application of intelligent models based on neural, they are not widely used because of lacking explicitness and complexity governing on choosing and architecting of proper network. In this study, a Genetic expression programming model (as an important branches of evolutionary algorithems for predicting of sediment load is selected and investigated as an intelligent approach along with other known classical and imperical methods such as Larsen´s equation, Engelund-Hansen´s equation and Bagnold´s equation. Materials and Methods: In this study, in order to improve explicit prediction of sediment load of Gotoorchay, located in Aras catchment, Northwestern Iran latitude: 38°24´33.3˝ and longitude: 44°46´13.2˝, genetic programming (GP and Genetic Algorithm (GA were applied. Moreover, the semi-empirical models for predicting of total sediment load and rating curve have been used. Finally all the methods were compared and the best ones were introduced. Two statistical measures were used to compare the performance of the different models, namely root mean square error (RMSE and determination coefficient (DC. RMSE and DC indicate the discrepancy between the observed and computed values. Results and Discussions: The statistical characteristics results obtained from the analysis of genetic programming method for both selected model groups indicated that the model 4 including the only discharge of the river, relative to other studied models had the highest DC and the least RMSE in the testing stage (DC= 0.907, RMSE= 0.067. Although there were several parameters applied in other models, these models were complicated and had weak results of prediction. Our results showed that the model 9

  19. Development of a generally applicable morphokinetic algorithm capable of predicting the implantation potential of embryos transferred on Day 3

    Science.gov (United States)

    Petersen, Bjørn Molt; Boel, Mikkel; Montag, Markus; Gardner, David K.

    2016-01-01

    STUDY QUESTION Can a generally applicable morphokinetic algorithm suitable for Day 3 transfers of time-lapse monitored embryos originating from different culture conditions and fertilization methods be developed for the purpose of supporting the embryologist's decision on which embryo to transfer back to the patient in assisted reproduction? SUMMARY ANSWER The algorithm presented here can be used independently of culture conditions and fertilization method and provides predictive power not surpassed by other published algorithms for ranking embryos according to their blastocyst formation potential. WHAT IS KNOWN ALREADY Generally applicable algorithms have so far been developed only for predicting blastocyst formation. A number of clinics have reported validated implantation prediction algorithms, which have been developed based on clinic-specific culture conditions and clinical environment. However, a generally applicable embryo evaluation algorithm based on actual implantation outcome has not yet been reported. STUDY DESIGN, SIZE, DURATION Retrospective evaluation of data extracted from a database of known implantation data (KID) originating from 3275 embryos transferred on Day 3 conducted in 24 clinics between 2009 and 2014. The data represented different culture conditions (reduced and ambient oxygen with various culture medium strategies) and fertilization methods (IVF, ICSI). The capability to predict blastocyst formation was evaluated on an independent set of morphokinetic data from 11 218 embryos which had been cultured to Day 5. PARTICIPANTS/MATERIALS, SETTING, METHODS The algorithm was developed by applying automated recursive partitioning to a large number of annotation types and derived equations, progressing to a five-fold cross-validation test of the complete data set and a validation test of different incubation conditions and fertilization methods. The results were expressed as receiver operating characteristics curves using the area under the

  20. Prediction of Aerodynamic Coefficient using Genetic Algorithm Optimized Neural Network for Sparse Data

    Science.gov (United States)

    Rajkumar, T.; Bardina, Jorge; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Wind tunnels use scale models to characterize aerodynamic coefficients, Wind tunnel testing can be slow and costly due to high personnel overhead and intensive power utilization. Although manual curve fitting can be done, it is highly efficient to use a neural network to define the complex relationship between variables. Numerical simulation of complex vehicles on the wide range of conditions required for flight simulation requires static and dynamic data. Static data at low Mach numbers and angles of attack may be obtained with simpler Euler codes. Static data of stalled vehicles where zones of flow separation are usually present at higher angles of attack require Navier-Stokes simulations which are costly due to the large processing time required to attain convergence. Preliminary dynamic data may be obtained with simpler methods based on correlations and vortex methods; however, accurate prediction of the dynamic coefficients requires complex and costly numerical simulations. A reliable and fast method of predicting complex aerodynamic coefficients for flight simulation I'S presented using a neural network. The training data for the neural network are derived from numerical simulations and wind-tunnel experiments. The aerodynamic coefficients are modeled as functions of the flow characteristics and the control surfaces of the vehicle. The basic coefficients of lift, drag and pitching moment are expressed as functions of angles of attack and Mach number. The modeled and training aerodynamic coefficients show good agreement. This method shows excellent potential for rapid development of aerodynamic models for flight simulation. Genetic Algorithms (GA) are used to optimize a previously built Artificial Neural Network (ANN) that reliably predicts aerodynamic coefficients. Results indicate that the GA provided an efficient method of optimizing the ANN model to predict aerodynamic coefficients. The reliability of the ANN using the GA includes prediction of aerodynamic

  1. REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH

    Directory of Open Access Journals (Sweden)

    R. SATISHKUMAR

    2017-01-01

    Full Text Available The role of bearings is significant in reducing the down time of all rotating machineries. The increasing trend of bearing failures in recent times has triggered the need and importance of deployment of condition monitoring. There are multiple factors associated to a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of the bearings in operation. In past, predictive models with regression techniques were widely used for bearing lifetime estimations. The Objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach. The ultimate objective of this study is to strengthen the predictive maintenance. The present study was done using classification approach following the concepts of machine learning and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment wherein the bearings are made to run till it fails naturally. It should be noted that the experiment was carried out with new bearings at pre-defined load and speed conditions until the bearing fails on its own. In the present work, statistical features were deployed and feature selection process was carried out using J48 decision tree and selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique is made use of in building a predictive model to estimate the lifetime of bearings. The performance of classifier was cross validated with distinct data. The result shows that the K-Star classification model gives 98.56% classification accuracy with selected features.

  2. Improved feature selection based on genetic algorithms for real time disruption prediction on JET

    International Nuclear Information System (INIS)

    Rattá, G.A.; Vega, J.; Murari, A.

    2012-01-01

    Highlights: ► A new signal selection methodology to improve disruption prediction is reported. ► The approach is based on Genetic Algorithms. ► An advanced predictor has been created with the new set of signals. ► The new system obtains considerably higher prediction rates. - Abstract: The early prediction of disruptions is an important aspect of the research in the field of Tokamak control. A very recent predictor, called “Advanced Predictor Of Disruptions” (APODIS), developed for the “Joint European Torus” (JET), implements the real time recognition of incoming disruptions with the best success rate achieved ever and an outstanding stability for long periods following training. In this article, a new methodology to select the set of the signals’ parameters in order to maximize the performance of the predictor is reported. The approach is based on “Genetic Algorithms” (GAs). With the feature selection derived from GAs, a new version of APODIS has been developed. The results are significantly better than the previous version not only in terms of success rates but also in extending the interval before the disruption in which reliable predictions are achieved. Correct disruption predictions with a success rate in excess of 90% have been achieved 200 ms before the time of the disruption. The predictor response is compared with that of JET's Protection System (JPS) and the ADODIS predictor is shown to be far superior. Both systems have been carefully tested with a wide number of discharges to understand their relative merits and the most profitable directions of further improvements.

  3. HomoTarget: a new algorithm for prediction of microRNA targets in Homo sapiens.

    Science.gov (United States)

    Ahmadi, Hamed; Ahmadi, Ali; Azimzadeh-Jamalkandi, Sadegh; Shoorehdeli, Mahdi Aliyari; Salehzadeh-Yazdi, Ali; Bidkhori, Gholamreza; Masoudi-Nejad, Ali

    2013-02-01

    MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target-genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, combined pattern recognition neural network (PRNN) and principle component analysis (PCA) architecture has been proposed in order to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the mirBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates. Copyright © 2012 Elsevier Inc. All rights reserved.

  4. PEDLA: predicting enhancers with a deep learning-based algorithmic framework.

    Science.gov (United States)

    Liu, Feng; Li, Hao; Ren, Chao; Bo, Xiaochen; Shu, Wenjie

    2016-06-22

    Transcriptional enhancers are non-coding segments of DNA that play a central role in the spatiotemporal regulation of gene expression programs. However, systematically and precisely predicting enhancers remain a major challenge. Although existing methods have achieved some success in enhancer prediction, they still suffer from many issues. We developed a deep learning-based algorithmic framework named PEDLA (https://github.com/wenjiegroup/PEDLA), which can directly learn an enhancer predictor from massively heterogeneous data and generalize in ways that are mostly consistent across various cell types/tissues. We first trained PEDLA with 1,114-dimensional heterogeneous features in H1 cells, and demonstrated that PEDLA framework integrates diverse heterogeneous features and gives state-of-the-art performance relative to five existing methods for enhancer prediction. We further extended PEDLA to iteratively learn from 22 training cell types/tissues. Our results showed that PEDLA manifested superior performance consistency in both training and independent test sets. On average, PEDLA achieved 95.0% accuracy and a 96.8% geometric mean (GM) of sensitivity and specificity across 22 training cell types/tissues, as well as 95.7% accuracy and a 96.8% GM across 20 independent test cell types/tissues. Together, our work illustrates the power of harnessing state-of-the-art deep learning techniques to consistently identify regulatory elements at a genome-wide scale from massively heterogeneous data across diverse cell types/tissues.

  5. Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method

    Science.gov (United States)

    Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty

    2017-03-01

    Hydroponics is a method of farming without soil. One of the Hydroponic plants is Watercress (Nasturtium Officinale). The development and growth process of hydroponic Watercress was influenced by levels of nutrients, acidity and temperature. The independent variables can be used as input variable system to predict the value level of plants growth and development. The prediction system is using Fuzzy Algorithm Mamdani method. This system was built to implement the function of Fuzzy Inference System (Fuzzy Inference System/FIS) as a part of the Fuzzy Logic Toolbox (FLT) by using MATLAB R2007b. FIS is a computing system that works on the principle of fuzzy reasoning which is similar to humans' reasoning. Basically FIS consists of four units which are fuzzification unit, fuzzy logic reasoning unit, base knowledge unit and defuzzification unit. In addition to know the effect of independent variables on the plants growth and development that can be visualized with the function diagram of FIS output surface that is shaped three-dimensional, and statistical tests based on the data from the prediction system using multiple linear regression method, which includes multiple linear regression analysis, T test, F test, the coefficient of determination and donations predictor that are calculated using SPSS (Statistical Product and Service Solutions) software applications.

  6. A hybrid genetic algorithm and linear regression for prediction of NOx emission in power generation plant

    International Nuclear Information System (INIS)

    Bunyamin, Muhammad Afif; Yap, Keem Siah; Aziz, Nur Liyana Afiqah Abdul; Tiong, Sheih Kiong; Wong, Shen Yuong; Kamal, Md Fauzan

    2013-01-01

    This paper presents a new approach of gas emission estimation in power generation plant using a hybrid Genetic Algorithm (GA) and Linear Regression (LR) (denoted as GA-LR). The LR is one of the approaches that model the relationship between an output dependant variable, y, with one or more explanatory variables or inputs which denoted as x. It is able to estimate unknown model parameters from inputs data. On the other hand, GA is used to search for the optimal solution until specific criteria is met causing termination. These results include providing good solutions as compared to one optimal solution for complex problems. Thus, GA is widely used as feature selection. By combining the LR and GA (GA-LR), this new technique is able to select the most important input features as well as giving more accurate prediction by minimizing the prediction errors. This new technique is able to produce more consistent of gas emission estimation, which may help in reducing population to the environment. In this paper, the study's interest is focused on nitrous oxides (NOx) prediction. The results of the experiment are encouraging.

  7. Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

    Science.gov (United States)

    Patel, Nihir; Wang, Jason T L

    2015-10-01

    Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.

  8. Development and verification of an analytical algorithm to predict absorbed dose distributions in ocular proton therapy using Monte Carlo simulations

    International Nuclear Information System (INIS)

    Koch, Nicholas C; Newhauser, Wayne D

    2010-01-01

    Proton beam radiotherapy is an effective and non-invasive treatment for uveal melanoma. Recent research efforts have focused on improving the dosimetric accuracy of treatment planning and overcoming the present limitation of relative analytical dose calculations. Monte Carlo algorithms have been shown to accurately predict dose per monitor unit (D/MU) values, but this has yet to be shown for analytical algorithms dedicated to ocular proton therapy, which are typically less computationally expensive than Monte Carlo algorithms. The objective of this study was to determine if an analytical method could predict absolute dose distributions and D/MU values for a variety of treatment fields like those used in ocular proton therapy. To accomplish this objective, we used a previously validated Monte Carlo model of an ocular nozzle to develop an analytical algorithm to predict three-dimensional distributions of D/MU values from pristine Bragg peaks and therapeutically useful spread-out Bragg peaks (SOBPs). Results demonstrated generally good agreement between the analytical and Monte Carlo absolute dose calculations. While agreement in the proximal region decreased for beams with less penetrating Bragg peaks compared with the open-beam condition, the difference was shown to be largely attributable to edge-scattered protons. A method for including this effect in any future analytical algorithm was proposed. Comparisons of D/MU values showed typical agreement to within 0.5%. We conclude that analytical algorithms can be employed to accurately predict absolute proton dose distributions delivered by an ocular nozzle.

  9. A Path Tracking Algorithm Using Future Prediction Control with Spike Detection for an Autonomous Vehicle Robot

    Directory of Open Access Journals (Sweden)

    Muhammad Aizzat Zakaria

    2013-08-01

    Full Text Available Trajectory tracking is an important aspect of autonomous vehicles. The idea behind trajectory tracking is the ability of the vehicle to follow a predefined path with zero steady state error. The difficulty arises due to the nonlinearity of vehicle dynamics. Therefore, this paper proposes a stable tracking control for an autonomous vehicle. An approach that consists of steering wheel control and lateral control is introduced. This control algorithm is used for a non-holonomic navigation problem, namely tracking a reference trajectory in a closed loop form. A proposed future prediction point control algorithm is used to calculate the vehicle's lateral error in order to improve the performance of the trajectory tracking. A feedback sensor signal from the steering wheel angle and yaw rate sensor is used as feedback information for the controller. The controller consists of a relationship between the future point lateral error, the linear velocity, the heading error and the reference yaw rate. This paper also introduces a spike detection algorithm to track the spike error that occurs during GPS reading. The proposed idea is to take the advantage of the derivative of the steering rate. This paper aims to tackle the lateral error problem by applying the steering control law to the vehicle, and proposes a new path tracking control method by considering the future coordinate of the vehicle and the future estimated lateral error. The effectiveness of the proposed controller is demonstrated by a simulation and a GPS experiment with noisy data. The approach used in this paper is not limited to autonomous vehicles alone since the concept of autonomous vehicle tracking can be used in mobile robot platforms, as the kinematic model of these two platforms is similar.

  10. Machine Learning Algorithms for prediction of regions of high Reynolds Averaged Navier Stokes Uncertainty

    Science.gov (United States)

    Mishra, Aashwin; Iaccarino, Gianluca

    2017-11-01

    In spite of their deficiencies, RANS models represent the workhorse for industrial investigations into turbulent flows. In this context, it is essential to provide diagnostic measures to assess the quality of RANS predictions. To this end, the primary step is to identify feature importances amongst massive sets of potentially descriptive and discriminative flow features. This aids the physical interpretability of the resultant discrepancy model and its extensibility to similar problems. Recent investigations have utilized approaches such as Random Forests, Support Vector Machines and the Least Absolute Shrinkage and Selection Operator for feature selection. With examples, we exhibit how such methods may not be suitable for turbulent flow datasets. The underlying rationale, such as the correlation bias and the required conditions for the success of penalized algorithms, are discussed with illustrative examples. Finally, we provide alternate approaches using convex combinations of regularized regression approaches and randomized sub-sampling in combination with feature selection algorithms, to infer model structure from data. This research was supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).

  11. Frontal responses during learning predict vulnerability to the psychotogenic effects of ketamine : Linking cognition, brain activity, and psychosis

    NARCIS (Netherlands)

    Corlett, Philip R.; Honey, Garry D.; Aitken, Michael R. F.; Dickinson, Anthony; Shanks, David R.; Absalom, Anthony R.; Lee, Michael; Pomarol-Clotet, Edith; Murray, Graham K.; McKenna, Peter J.; Robbins, Trevor W.; Bullmore, Edward T.; Fletcher, Paul C.

    Context: Establishing a neurobiological account of delusion formation that links cognitive processes, brain activity, and symptoms is important to furthering our understanding of psychosis. Objective: To explore a theoretical model of delusion formation that implicates prediction error - dependent

  12. Soft sensor development for Mooney viscosity prediction in rubber mixing process based on GMMDJITGPR algorithm

    Science.gov (United States)

    Yang, Kai; Chen, Xiangguang; Wang, Li; Jin, Huaiping

    2017-01-01

    In rubber mixing process, the key parameter (Mooney viscosity), which is used to evaluate the property of the product, can only be obtained with 4-6h delay offline. It is quite helpful for the industry, if the parameter can be estimate on line. Various data driven soft sensors have been used to prediction in the rubber mixing. However, it always not functions well due to the phase and nonlinear property in the process. The purpose of this paper is to develop an efficient soft sensing algorithm to solve the problem. Based on the proposed GMMD local sample selecting criterion, the phase information is extracted in the local modeling. Using the Gaussian local modeling method within Just-in-time (JIT) learning framework, nonlinearity of the process is well handled. Efficiency of the new method is verified by comparing the performance with various mainstream soft sensors, using the samples from real industrial rubber mixing process.

  13. Design of artificial neural networks using a genetic algorithm to predict collection efficiency in venturi scrubbers.

    Science.gov (United States)

    Taheri, Mahboobeh; Mohebbi, Ali

    2008-08-30

    In this study, a new approach for the auto-design of neural networks, based on a genetic algorithm (GA), has been used to predict collection efficiency in venturi scrubbers. The experimental input data, including particle diameter, throat gas velocity, liquid to gas flow rate ratio, throat hydraulic diameter, pressure drop across the venturi scrubber and collection efficiency as an output, have been used to create a GA-artificial neural network (ANN) model. The testing results from the model are in good agreement with the experimental data. Comparison of the results of the GA optimized ANN model with the results from the trial-and-error calibrated ANN model indicates that the GA-ANN model is more efficient. Finally, the effects of operating parameters such as liquid to gas flow rate ratio, throat gas velocity, and particle diameter on collection efficiency were determined.

  14. XTALOPT version r11: An open-source evolutionary algorithm for crystal structure prediction

    Science.gov (United States)

    Avery, Patrick; Falls, Zackary; Zurek, Eva

    2018-01-01

    Version 11 of XTALOPT, an evolutionary algorithm for crystal structure prediction, has now been made available for download from the CPC library or the XTALOPT website, http://xtalopt.github.io. Whereas the previous versions of XTALOPT were published under the Gnu Public License (GPL), the current version is made available under the 3-Clause BSD License, which is an open source license that is recognized by the Open Source Initiative. Importantly, the new version can be executed via a command line interface (i.e., it does not require the use of a Graphical User Interface). Moreover, the new version is written as a stand-alone program, rather than an extension to AVOGADRO.

  15. Application of a genetic algorithm in the conformational analysis of methylene-acetal-linked thymine dimers in DNA: Comparison with distance geometry calculations

    International Nuclear Information System (INIS)

    Beckers, Mischa L.M.; Buydens, Lutgarde M.C.; Pikkemaat, Jeroen A.; Altona, Cornelis

    1997-01-01

    The three-dimensional spatial structure of a methylene-acetal-linked thymine dimer present in a 10 base-pair (bp) sense-antisense DNA duplex was studied with a genetic algorithm designed to interpret NOE distance restraints. Trial solutions were represented by torsion angles. This means that bond angles for the dimer trial structures are kept fixed during the genetic algorithm optimization. Bond angle values were extracted from a 10 bp sense-antisense duplex model that was subjected to energy minimization by means of a modified AMBER force field. A set of 63 proton-proton distance restraints defining the methylene-acetal-linked thymine dimer was available. The genetic algorithm minimizes the difference between distances in the trial structures and distance restraints. A large conformational search space could be covered in the genetic algorithm optimization by allowing a wide range of torsion angles. The genetic algorithm optimization in all cases led to one family of structures. This family of the methylene-acetal-linked thymine dimer in the duplex differs from the family that was suggested from distance geometry calculations. It is demonstrated that the bond angle geometry around the methylene-acetal linkage plays an important role in the optimization

  16. Demonstration of the use of ADAPT to derive predictive maintenance algorithms for the KSC central heat plant

    Science.gov (United States)

    Hunter, H. E.

    1972-01-01

    The Avco Data Analysis and Prediction Techniques (ADAPT) were employed to determine laws capable of detecting failures in a heat plant up to three days in advance of the occurrence of the failure. The projected performance of algorithms yielded a detection probability of 90% with false alarm rates of the order of 1 per year for a sample rate of 1 per day with each detection, followed by 3 hourly samplings. This performance was verified on 173 independent test cases. The program also demonstrated diagnostic algorithms and the ability to predict the time of failure to approximately plus or minus 8 hours up to three days in advance of the failure. The ADAPT programs produce simple algorithms which have a unique possibility of a relatively low cost updating procedure. The algorithms were implemented on general purpose computers at Kennedy Space Flight Center and tested against current data.

  17. A Clinical Prediction Algorithm to Stratify Pediatric Musculoskeletal Infection by Severity

    Science.gov (United States)

    Benvenuti, Michael A; An, Thomas J; Mignemi, Megan E; Martus, Jeffrey E; Mencio, Gregory A; Lovejoy, Stephen A; Thomsen, Isaac P; Schoenecker, Jonathan G; Williams, Derek J

    2016-01-01

    Objective There are currently no algorithms for early stratification of pediatric musculoskeletal infection (MSKI) severity that are applicable to all types of tissue involvement. In this study, the authors sought to develop a clinical prediction algorithm that accurately stratifies infection severity based on clinical and laboratory data at presentation to the emergency department. Methods An IRB-approved retrospective review was conducted to identify patients aged 0–18 who presented to the pediatric emergency department at a tertiary care children’s hospital with concern for acute MSKI over a five-year period (2008–2013). Qualifying records were reviewed to obtain clinical and laboratory data and to classify in-hospital outcomes using a three-tiered severity stratification system. Ordinal regression was used to estimate risk for each outcome. Candidate predictors included age, temperature, respiratory rate, heart rate, C-reactive protein, and peripheral white blood cell count. We fit fully specified (all predictors) and reduced models (retaining predictors with a p-value ≤ 0.2). Discriminatory power of the models was assessed using the concordance (c)-index. Results Of the 273 identified children, 191 (70%) met inclusion criteria. Median age was 5.8 years. Outcomes included 47 (25%) children with inflammation only, 41 (21%) with local infection, and 103 (54%) with disseminated infection. Both the full and reduced models accurately demonstrated excellent performance (full model c-index 0.83, 95% CI [0.79–0.88]; reduced model 0.83, 95% CI [0.78–0.87]). Model fit was also similar, indicating preference for the reduced model. Variables in this model included C-reactive protein, pulse, temperature, and an interaction term for pulse and temperature. The odds of a more severe outcome increased by 30% for every 10-unit increase in C-reactive protein. Conclusions Clinical and laboratory data obtained in the emergency department may be used to accurately

  18. PREDICTIVE CONTROL OF A BATCH POLYMERIZATION SYSTEM USING A FEEDFORWARD NEURAL NETWORK WITH ONLINE ADAPTATION BY GENETIC ALGORITHM

    OpenAIRE

    Cancelier, A.; Claumann, C. A.; Bolzan, A.; Machado, R. A. F.

    2016-01-01

    Abstract This study used a predictive controller based on an empirical nonlinear model comprising a three-layer feedforward neural network for temperature control of the suspension polymerization process. In addition to the offline training technique, an algorithm was also analyzed for online adaptation of its parameters. For the offline training, the network was statically trained and the genetic algorithm technique was used in combination with the least squares method. For online training, ...

  19. Predicting the onset of hazardous alcohol drinking in primary care: development and validation of a simple risk algorithm.

    Science.gov (United States)

    Bellón, Juan Ángel; de Dios Luna, Juan; King, Michael; Nazareth, Irwin; Motrico, Emma; GildeGómez-Barragán, María Josefa; Torres-González, Francisco; Montón-Franco, Carmen; Sánchez-Celaya, Marta; Díaz-Barreiros, Miguel Ángel; Vicens, Catalina; Moreno-Peral, Patricia

    2017-04-01

    Little is known about the risk of progressing to hazardous alcohol use in abstinent or low-risk drinkers. To develop and validate a simple brief risk algorithm for the onset of hazardous alcohol drinking (HAD) over 12 months for use in primary care. Prospective cohort study in 32 health centres from six Spanish provinces, with evaluations at baseline, 6 months, and 12 months. Forty-one risk factors were measured and multilevel logistic regression and inverse probability weighting were used to build the risk algorithm. The outcome was new occurrence of HAD during the study, as measured by the AUDIT. From the lists of 174 GPs, 3954 adult abstinent or low-risk drinkers were recruited. The 'predictAL-10' risk algorithm included just nine variables (10 questions): province, sex, age, cigarette consumption, perception of financial strain, having ever received treatment for an alcohol problem, childhood sexual abuse, AUDIT-C, and interaction AUDIT-C*Age. The c-index was 0.886 (95% CI = 0.854 to 0.918). The optimal cutoff had a sensitivity of 0.83 and specificity of 0.80. Excluding childhood sexual abuse from the model (the 'predictAL-9'), the c-index was 0.880 (95% CI = 0.847 to 0.913), sensitivity 0.79, and specificity 0.81. There was no statistically significant difference between the c-indexes of predictAL-10 and predictAL-9. The predictAL-10/9 is a simple and internally valid risk algorithm to predict the onset of hazardous alcohol drinking over 12 months in primary care attendees; it is a brief tool that is potentially useful for primary prevention of hazardous alcohol drinking. © British Journal of General Practice 2017.

  20. A Homogeneous and Self-Dual Interior-Point Linear Programming Algorithm for Economic Model Predictive Control

    DEFF Research Database (Denmark)

    Sokoler, Leo Emil; Frison, Gianluca; Skajaa, Anders

    2015-01-01

    We develop an efficient homogeneous and self-dual interior-point method (IPM) for the linear programs arising in economic model predictive control of constrained linear systems with linear objective functions. The algorithm is based on a Riccati iteration procedure, which is adapted to the linear...... system of equations solved in homogeneous and self-dual IPMs. Fast convergence is further achieved using a warm-start strategy. We implement the algorithm in MATLAB and C. Its performance is tested using a conceptual power management case study. Closed loop simulations show that 1) the proposed algorithm...

  1. Intermediate-term medium-range earthquake prediction algorithm M8: A new spatially stabilized application in Italy

    International Nuclear Information System (INIS)

    Romashkova, L.L.; Kossobokov, V.G.; Peresan, A.; Panza, G.F.

    2001-12-01

    A series of experiments, based on the intermediate-term earthquake prediction algorithm M8, has been performed for the retrospective simulation of forward predictions in the Italian territory, with the aim to design an experimental routine for real-time predictions. These experiments evidenced two main difficulties for the application of M8 in Italy. The first one is due to the fact that regional catalogues are usually limited in space. The second one concerns certain arbitrariness and instability, with respect to the positioning of the circles of investigation. Here we design a new scheme for the application of the algorithm M8, which is less subjective and less sensitive to the position of the circles of investigation. To perform this test, we consider a recent revision of the Italian catalogue, named UCI2001, composed by CCI1996, NEIC and ALPOR data for the period 1900-1985, and updated with the NEIC reduces the spatial heterogeneity of the data at the boundaries of Italy. The new variant of the M8 algorithm application reduces the number of spurious alarms and increases the reliability of predictions. As a result, three out of four earthquakes with magnitude M max larger than 6.0 are predicted in the retrospective simulation of the forward prediction, during the period 1972-2001, with a space-time volume of alarms comparable to that obtained with the non-stabilized variant of the M8 algorithm in Italy. (author)

  2. Predicting Student Academic Performance: A Comparison of Two Meta-Heuristic Algorithms Inspired by Cuckoo Birds for Training Neural Networks

    Directory of Open Access Journals (Sweden)

    Jeng-Fung Chen

    2014-10-01

    Full Text Available Predicting student academic performance with a high accuracy facilitates admission decisions and enhances educational services at educational institutions. This raises the need to propose a model that predicts student performance, based on the results of standardized exams, including university entrance exams, high school graduation exams, and other influential factors. In this study, an approach to the problem based on the artificial neural network (ANN with the two meta-heuristic algorithms inspired by cuckoo birds and their lifestyle, namely, Cuckoo Search (CS and Cuckoo Optimization Algorithm (COA is proposed. In particular, we used previous exam results and other factors, such as the location of the student’s high school and the student’s gender as input variables, and predicted the student academic performance. The standard CS and standard COA were separately utilized to train the feed-forward network for prediction. The algorithms optimized the weights between layers and biases of the neuron network. The simulation results were then discussed and analyzed to investigate the prediction ability of the neural network trained by these two algorithms. The findings demonstrated that both CS and COA have potential in training ANN and ANN-COA obtained slightly better results for predicting student academic performance in this case. It is expected that this work may be used to support student admission procedures and strengthen the service system in educational institutions.

  3. Long-term prediction of chaotic time series with multi-step prediction horizons by a neural network with Levenberg-Marquardt learning algorithm

    International Nuclear Information System (INIS)

    Mirzaee, Hossein

    2009-01-01

    The Levenberg-Marquardt learning algorithm is applied for training a multilayer perception with three hidden layer each with ten neurons in order to carefully map the structure of chaotic time series such as Mackey-Glass time series. First the MLP network is trained with 1000 data, and then it is tested with next 500 data. After that the trained and tested network is applied for long-term prediction of next 120 data which come after test data. The prediction is such a way that, the first inputs to network for prediction are the four last data of test data, then the predicted value is shifted to the regression vector which is the input to the network, then after first four-step of prediction, the input regression vector to network is fully predicted values and in continue, each predicted data is shifted to input vector for subsequent prediction.

  4. Prediction and optimization of thinning in automotive sealing cover using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Ganesh M. Kakandikar

    2016-01-01

    Full Text Available Deep drawing is a forming process in which a blank of sheet metal is radially drawn into a forming die by the mechanical action of a punch and converted to required shape. Deep drawing involves complex material flow conditions and force distributions. Radial drawing stresses and tangential compressive stresses are induced in flange region due to the material retention property. These compressive stresses result in wrinkling phenomenon in flange region. Normally blank holder is applied for restricting wrinkles. Tensile stresses in radial direction initiate thinning in the wall region of cup. The thinning results into cracking or fracture. The finite element method is widely applied worldwide to simulate the deep drawing process. For real-life simulations of deep drawing process an accurate numerical model, as well as an accurate description of material behavior and contact conditions, is necessary. The finite element method is a powerful tool to predict material thinning deformations before prototypes are made. The proposed innovative methodology combines two techniques for prediction and optimization of thinning in automotive sealing cover. Taguchi design of experiments and analysis of variance has been applied to analyze the influencing process parameters on Thinning. Mathematical relations have been developed to correlate input process parameters and Thinning. Optimization problem has been formulated for thinning and Genetic Algorithm has been applied for optimization. Experimental validation of results proves the applicability of newly proposed approach. The optimized component when manufactured is observed to be safe, no thinning or fracture is observed.

  5. Prediction of Flood Warning in Taiwan Using Nonlinear SVM with Simulated Annealing Algorithm

    Science.gov (United States)

    Lee, C.

    2013-12-01

    The issue of the floods is important in Taiwan. It is because the narrow and high topography of the island make lots of rivers steep in Taiwan. The tropical depression likes typhoon always causes rivers to flood. Prediction of river flow under the extreme rainfall circumstances is important for government to announce the warning of flood. Every time typhoon passed through Taiwan, there were always floods along some rivers. The warning is classified to three levels according to the warning water levels in Taiwan. The propose of this study is to predict the level of floods warning from the information of precipitation, rainfall duration and slope of riverbed. To classify the level of floods warning by the above-mentioned information and modeling the problems, a machine learning model, nonlinear Support vector machine (SVM), is formulated to classify the level of floods warning. In addition, simulated annealing (SA), a probabilistic heuristic algorithm, is used to determine the optimal parameter of the SVM model. A case study of flooding-trend rivers of different gradients in Taiwan is conducted. The contribution of this SVM model with simulated annealing is capable of making efficient announcement for flood warning and keeping the danger of flood from residents along the rivers.

  6. CN earthquake prediction algorithm and the monitoring of the future strong Vrancea events

    International Nuclear Information System (INIS)

    Moldoveanu, C.L.; Radulian, M.; Novikova, O.V.; Panza, G.F.

    2002-01-01

    The strong earthquakes originating at intermediate-depth in the Vrancea region (located in the SE corner of the highly bent Carpathian arc) represent one of the most important natural disasters able to induce heavy effects (high tool of casualties and extensive damage) in the Romanian territory. The occurrence of these earthquakes is irregular, but not infrequent. Their effects are felt over a large territory, from Central Europe to Moscow and from Greece to Scandinavia. The largest cultural and economical center exposed to the seismic risk due to the Vrancea earthquakes is Bucharest. This metropolitan area (230 km 2 wide) is characterized by the presence of 2.5 million inhabitants (10% of the country population) and by a considerable number of high-risk structures and infrastructures. The best way to face strong earthquakes is to mitigate the seismic risk by using the two possible complementary approaches represented by (a) the antiseismic design of structures and infrastructures (able to support strong earthquakes without significant damage), and (b) the strong earthquake prediction (in terms of alarm intervals declared for long, intermediate or short-term space-and time-windows). The intermediate term medium-range earthquake prediction represents the most realistic target to be reached at the present state of knowledge. The alarm declared in this case extends over a time window of about one year or more, and a space window of a few hundreds of kilometers. In the case of Vrancea events the spatial uncertainty is much less, being of about 100 km. The main measures for the mitigation of the seismic risk allowed by the intermediate-term medium-range prediction are: (a) verification of the buildings and infrastructures stability and reinforcement measures when required, (b) elaboration of emergency plans of action, (c) schedule of the main actions required in order to restore the normality of the social and economical life after the earthquake. The paper presents the

  7. Linking spring phenology with mechanistic models of host movement to predict disease transmission risk

    Science.gov (United States)

    Merkle, Jerod A.; Cross, Paul C.; Scurlock, Brandon M.; Cole, Eric K.; Courtemanch, Alyson B.; Dewey, Sarah R.; Kauffman, Matthew J.

    2018-01-01

    Disease models typically focus on temporal dynamics of infection, while often neglecting environmental processes that determine host movement. In many systems, however, temporal disease dynamics may be slow compared to the scale at which environmental conditions alter host space-use and accelerate disease transmission.Using a mechanistic movement modelling approach, we made space-use predictions of a mobile host (elk [Cervus Canadensis] carrying the bacterial disease brucellosis) under environmental conditions that change daily and annually (e.g., plant phenology, snow depth), and we used these predictions to infer how spring phenology influences the risk of brucellosis transmission from elk (through aborted foetuses) to livestock in the Greater Yellowstone Ecosystem.Using data from 288 female elk monitored with GPS collars, we fit step selection functions (SSFs) during the spring abortion season and then implemented a master equation approach to translate SSFs into predictions of daily elk distribution for five plausible winter weather scenarios (from a heavy snow, to an extreme winter drought year). We predicted abortion events by combining elk distributions with empirical estimates of daily abortion rates, spatially varying elk seroprevelance and elk population counts.Our results reveal strong spatial variation in disease transmission risk at daily and annual scales that is strongly governed by variation in host movement in response to spring phenology. For example, in comparison with an average snow year, years with early snowmelt are predicted to have 64% of the abortions occurring on feedgrounds shift to occurring on mainly public lands, and to a lesser extent on private lands.Synthesis and applications. Linking mechanistic models of host movement with disease dynamics leads to a novel bridge between movement and disease ecology. Our analysis framework offers new avenues for predicting disease spread, while providing managers tools to proactively mitigate

  8. An O(n(5)) algorithm for MFE prediction of kissing hairpins and 4-chains in nucleic acids.

    Science.gov (United States)

    Chen, Ho-Lin; Condon, Anne; Jabbari, Hosna

    2009-06-01

    Efficient methods for prediction of minimum free energy (MFE) nucleic secondary structures are widely used, both to better understand structure and function of biological RNAs and to design novel nano-structures. Here, we present a new algorithm for MFE secondary structure prediction, which significantly expands the class of structures that can be handled in O(n(5)) time. Our algorithm can handle H-type pseudoknotted structures, kissing hairpins, and chains of four overlapping stems, as well as nested substructures of these types.

  9. Short communication: Prediction of retention pay-off using a machine learning algorithm.

    Science.gov (United States)

    Shahinfar, Saleh; Kalantari, Afshin S; Cabrera, Victor; Weigel, Kent

    2014-05-01

    Replacement decisions have a major effect on dairy farm profitability. Dynamic programming (DP) has been widely studied to find the optimal replacement policies in dairy cattle. However, DP models are computationally intensive and might not be practical for daily decision making. Hence, the ability of applying machine learning on a prerun DP model to provide fast and accurate predictions of nonlinear and intercorrelated variables makes it an ideal methodology. Milk class (1 to 5), lactation number (1 to 9), month in milk (1 to 20), and month of pregnancy (0 to 9) were used to describe all cows in a herd in a DP model. Twenty-seven scenarios based on all combinations of 3 levels (base, 20% above, and 20% below) of milk production, milk price, and replacement cost were solved with the DP model, resulting in a data set of 122,716 records, each with a calculated retention pay-off (RPO). Then, a machine learning model tree algorithm was used to mimic the evaluated RPO with DP. The correlation coefficient factor was used to observe the concordance of RPO evaluated by DP and RPO predicted by the model tree. The obtained correlation coefficient was 0.991, with a corresponding value of 0.11 for relative absolute error. At least 100 instances were required per model constraint, resulting in 204 total equations (models). When these models were used for binary classification of positive and negative RPO, error rates were 1% false negatives and 9% false positives. Applying this trained model from simulated data for prediction of RPO for 102 actual replacement records from the University of Wisconsin-Madison dairy herd resulted in a 0.994 correlation with 0.10 relative absolute error rate. Overall results showed that model tree has a potential to be used in conjunction with DP to assist farmers in their replacement decisions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  10. Predicting recurrent aphthous ulceration using genetic algorithms-optimized neural networks

    Directory of Open Access Journals (Sweden)

    Najla S Dar-Odeh

    2010-05-01

    Full Text Available Najla S Dar-Odeh1, Othman M Alsmadi2, Faris Bakri3, Zaer Abu-Hammour2, Asem A Shehabi3, Mahmoud K Al-Omiri1, Shatha M K Abu-Hammad4, Hamzeh Al-Mashni4, Mohammad B Saeed4, Wael Muqbil4, Osama A Abu-Hammad1 1Faculty of Dentistry, 2Faculty of Engineering and Technology, 3Faculty of Medicine, University of Jordan, Amman, Jordan; 4Dental Department, University of Jordan Hospital, Amman, JordanObjective: To construct and optimize a neural network that is capable of predicting the occurrence of recurrent aphthous ulceration (RAU based on a set of appropriate input data.Participants and methods: Artificial neural networks (ANN software employing genetic algorithms to optimize the architecture neural networks was used. Input and output data of 86 participants (predisposing factors and status of the participants with regards to recurrent aphthous ulceration were used to construct and train the neural networks. The optimized neural networks were then tested using untrained data of a further 10 participants.Results: The optimized neural network, which produced the most accurate predictions for the presence or absence of recurrent aphthous ulceration was found to employ: gender, hematological (with or without ferritin and mycological data of the participants, frequency of tooth brushing, and consumption of vegetables and fruits.Conclusions: Factors appearing to be related to recurrent aphthous ulceration and appropriate for use as input data to construct ANNs that predict recurrent aphthous ulceration were found to include the following: gender, hemoglobin, serum vitamin B12, serum ferritin, red cell folate, salivary candidal colony count, frequency of tooth brushing, and the number of fruits or vegetables consumed daily.Keywords: artifical neural networks, recurrent, aphthous ulceration, ulcer

  11. DEVELOPMENT OF THE SOCIAL TENSION RISK PREDICTING ALGORITHM IN THE POPULATION OF CERTAIN REGIONS OF RUSSIA

    Directory of Open Access Journals (Sweden)

    A. B. Mulik

    2017-01-01

    Full Text Available Aim. The aim of the study was development of approaches to predict the risk of social tension for population of the Russian Federation regions.Methods. Theoretical studies based on the analysis of cartographic material from the National Atlas of Russia. The use of geo-information technologies has provided modeling of environmental load in the territory of certain regions of Russia. Experimental studies were performed using standard methods of psycho-physiological testing involving 336 persons 18-23 years old of both sexes.Results. As a fundamental biologically significant factor of the environment, differentiating the Russian Federation territory to areas with discrete actual physical effects, total solar radiation was determined. The subsequent allocation of model regions (Republic of Crimea, Rostov and Saratov regions based on the principle of minimizing regional differences associated factors of environmental pressure per person. Experimental studies have revealed persistent systemic relationships of phenotypic characteristics and tendency of person to neuropsychic tension. The risk of social tension for the study area population is predicted on the condition of finding more than two thirds of the representatives of sample within the borders of a high level of general non-specific reactivity of an organism.Main conclusions. The expediency of using the northern latitude as an integral index of differentiation of areas on the specifics of the severity of the physical factors of environmental impact on human activity is justified. The possibility of the application for the level of general nonspecific reactivity of an organism as a phenotypic trait marker of social tension risk is identified. An algorithm for predicting the risk of social tension among the population, compactly living in certain territories of the Russian Federation is designed. 

  12. A Fine-Grained API Link Prediction Approach Supporting CMDA Mashup Recommendation

    Science.gov (United States)

    Zhang, J.; Bao, Q.; Lee, T. J.; Ramachandran, R.; Lee, S.; Pan, L.; Gatlin, P. N.; Maskey, M.

    2017-12-01

    Service (API) discovery and recommendation is key to the wide spread of service oriented architecture and service oriented software engineering. Service recommendation typically relies on service linkage prediction calculated by the semantic distances (or similarities) among services based on their collection of inherent attributes. Given a specific context (mashup goal), however, different attributes may contribute differently to a service linkage. In this work, instead of training a model for all attributes as a whole, a novel approach is presented to simultaneously train separate models for individual attributes. Our contributions are summarized in three-fold. First is that we have developed a scalable attribute-level data model, featuring scalability and extensibility. We have extended Multiplicative Attribute Graph (MAG) model to represent node profiles featuring rich categorical attributes, while relaxing its constraint of requiring a priori knowledge of predefined attributes. LDA is leveraged to dynamically identify attributes based on attribute modeling, and multiple Gaussian fit is applied to find global optimal values. The second contribution is that we have seamlessly integrated the latent relationships between API attributes as well as observed network structure based on historical API usage data. Such a layered information model enables us to predict the probability of a link between two APIs based on their attribute link affinities carrying a variety of information including meta data, semantic data, historical usage data, as well as crowdsourcing user comments and annotations. The third contribution is that we have developed a finegrained context-aware mashup-API recommendation technique. On top of individual models trained for separate attributes, a dedicated layer is trained to represent the latent attribute distribution regarding mashup purpose, i.e., sensitivity of attributes to context. Thus, given the description of an intended mashup, the

  13. A grand canonical genetic algorithm for the prediction of multi-component phase diagrams and testing of empirical potentials

    International Nuclear Information System (INIS)

    Tipton, William W; Hennig, Richard G

    2013-01-01

    We present an evolutionary algorithm which predicts stable atomic structures and phase diagrams by searching the energy landscape of empirical and ab initio Hamiltonians. Composition and geometrical degrees of freedom may be varied simultaneously. We show that this method utilizes information from favorable local structure at one composition to predict that at others, achieving far greater efficiency of phase diagram prediction than a method which relies on sampling compositions individually. We detail this and a number of other efficiency-improving techniques implemented in the genetic algorithm for structure prediction code that is now publicly available. We test the efficiency of the software by searching the ternary Zr–Cu–Al system using an empirical embedded-atom model potential. In addition to testing the algorithm, we also evaluate the accuracy of the potential itself. We find that the potential stabilizes several correct ternary phases, while a few of the predicted ground states are unphysical. Our results suggest that genetic algorithm searches can be used to improve the methodology of empirical potential design. (paper)

  14. A grand canonical genetic algorithm for the prediction of multi-component phase diagrams and testing of empirical potentials.

    Science.gov (United States)

    Tipton, William W; Hennig, Richard G

    2013-12-11

    We present an evolutionary algorithm which predicts stable atomic structures and phase diagrams by searching the energy landscape of empirical and ab initio Hamiltonians. Composition and geometrical degrees of freedom may be varied simultaneously. We show that this method utilizes information from favorable local structure at one composition to predict that at others, achieving far greater efficiency of phase diagram prediction than a method which relies on sampling compositions individually. We detail this and a number of other efficiency-improving techniques implemented in the genetic algorithm for structure prediction code that is now publicly available. We test the efficiency of the software by searching the ternary Zr-Cu-Al system using an empirical embedded-atom model potential. In addition to testing the algorithm, we also evaluate the accuracy of the potential itself. We find that the potential stabilizes several correct ternary phases, while a few of the predicted ground states are unphysical. Our results suggest that genetic algorithm searches can be used to improve the methodology of empirical potential design.

  15. Runoff prediction using rainfall data from microwave links: Tabor case study.

    Science.gov (United States)

    Stransky, David; Fencl, Martin; Bares, Vojtech

    2018-05-01

    Rainfall spatio-temporal distribution is of great concern for rainfall-runoff modellers. Standard rainfall observations are, however, often scarce and/or expensive to obtain. Thus, rainfall observations from non-traditional sensors such as commercial microwave links (CMLs) represent a promising alternative. In this paper, rainfall observations from a municipal rain gauge (RG) monitoring network were complemented by CMLs and used as an input to a standard urban drainage model operated by the water utility of the Tabor agglomeration (CZ). Two rainfall datasets were used for runoff predictions: (i) the municipal RG network, i.e. the observation layout used by the water utility, and (ii) CMLs adjusted by the municipal RGs. The performance was evaluated in terms of runoff volumes and hydrograph shapes. The use of CMLs did not lead to distinctively better predictions in terms of runoff volumes; however, CMLs outperformed RGs used alone when reproducing a hydrograph's dynamics (peak discharges, Nash-Sutcliffe coefficient and hydrograph's rising limb timing). This finding is promising for number of urban drainage tasks working with dynamics of the flow. Moreover, CML data can be obtained from a telecommunication operator's data cloud at virtually no cost. That makes their use attractive for cities unable to improve their monitoring infrastructure for economic or organizational reasons.

  16. Intrinsic disorder in Viral Proteins Genome-Linked: experimental and predictive analyses

    Directory of Open Access Journals (Sweden)

    Van Dorsselaer Alain

    2009-02-01

    Full Text Available Abstract Background VPgs are viral proteins linked to the 5' end of some viral genomes. Interactions between several VPgs and eukaryotic translation initiation factors eIF4Es are critical for plant infection. However, VPgs are not restricted to phytoviruses, being also involved in genome replication and protein translation of several animal viruses. To date, structural data are still limited to small picornaviral VPgs. Recently three phytoviral VPgs were shown to be natively unfolded proteins. Results In this paper, we report the bacterial expression, purification and biochemical characterization of two phytoviral VPgs, namely the VPgs of Rice yellow mottle virus (RYMV, genus Sobemovirus and Lettuce mosaic virus (LMV, genus Potyvirus. Using far-UV circular dichroism and size exclusion chromatography, we show that RYMV and LMV VPgs are predominantly or partly unstructured in solution, respectively. Using several disorder predictors, we show that both proteins are predicted to possess disordered regions. We next extend theses results to 14 VPgs representative of the viral diversity. Disordered regions were predicted in all VPg sequences whatever the genus and the family. Conclusion Based on these results, we propose that intrinsic disorder is a common feature of VPgs. The functional role of intrinsic disorder is discussed in light of the biological roles of VPgs.

  17. Linking macroecology and community ecology: refining predictions of species distributions using biotic interaction networks.

    Science.gov (United States)

    Staniczenko, Phillip P A; Sivasubramaniam, Prabu; Suttle, K Blake; Pearson, Richard G

    2017-06-01

    Macroecological models for predicting species distributions usually only include abiotic environmental conditions as explanatory variables, despite knowledge from community ecology that all species are linked to other species through biotic interactions. This disconnect is largely due to the different spatial scales considered by the two sub-disciplines: macroecologists study patterns at large extents and coarse resolutions, while community ecologists focus on small extents and fine resolutions. A general framework for including biotic interactions in macroecological models would help bridge this divide, as it would allow for rigorous testing of the role that biotic interactions play in determining species ranges. Here, we present an approach that combines species distribution models with Bayesian networks, which enables the direct and indirect effects of biotic interactions to be modelled as propagating conditional dependencies among species' presences. We show that including biotic interactions in distribution models for species from a California grassland community results in better range predictions across the western USA. This new approach will be important for improving estimates of species distributions and their dynamics under environmental change. © 2017 The Authors. Ecology Letters published by CNRS and John Wiley & Sons Ltd.

  18. Positive predictive value of a register-based algorithm using the Danish National Registries to identify suicidal events.

    Science.gov (United States)

    Gasse, Christiane; Danielsen, Andreas Aalkjaer; Pedersen, Marianne Giørtz; Pedersen, Carsten Bøcker; Mors, Ole; Christensen, Jakob

    2018-04-17

    It is not possible to fully assess intention of self-harm and suicidal events using information from administrative databases. We conducted a validation study of intention of suicide attempts/self-harm contacts identified by a commonly applied Danish register-based algorithm (DK-algorithm) based on hospital discharge diagnosis and emergency room contacts. Of all 101 530 people identified with an incident suicide attempt/self-harm contact at Danish hospitals between 1995 and 2012 using the DK-algorithm, we selected a random sample of 475 people. We validated the DK-algorithm against medical records applying the definitions and terminology of the Columbia Classification Algorithm of Suicide Assessment of suicidal events, nonsuicidal events, and indeterminate or potentially suicidal events. We calculated positive predictive values (PPVs) of the DK-algorithm to identify suicidal events overall, by gender, age groups, and calendar time. We retrieved medical records for 357 (75%) people. The PPV of the DK-algorithm to identify suicidal events was 51.5% (95% CI: 46.4-56.7) overall, 42.7% (95% CI: 35.2-50.5) in males, and 58.5% (95% CI: 51.6-65.1) in females. The PPV varied further across age groups and calendar time. After excluding cases identified via the DK-algorithm by unspecific codes of intoxications and injury, the PPV improved slightly (56.8% [95% CI: 50.0-63.4]). The DK-algorithm can reliably identify self-harm with suicidal intention in 52% of the identified cases of suicide attempts/self-harm. The PPVs could be used for quantitative bias analysis and implemented as weights in future studies to estimate the proportion of suicidal events among cases identified via the DK-algorithm. Copyright © 2018 John Wiley & Sons, Ltd.

  19. SU-C-BRF-07: A Pattern Fusion Algorithm for Multi-Step Ahead Prediction of Surrogate Motion

    International Nuclear Information System (INIS)

    Zawisza, I; Yan, H; Yin, F

    2014-01-01

    Purpose: To assure that tumor motion is within the radiation field during high-dose and high-precision radiosurgery, real-time imaging and surrogate monitoring are employed. These methods are useful in providing real-time tumor/surrogate motion but no future information is available. In order to anticipate future tumor/surrogate motion and track target location precisely, an algorithm is developed and investigated for estimating surrogate motion multiple-steps ahead. Methods: The study utilized a one-dimensional surrogate motion signal divided into three components: (a) training component containing the primary data including the first frame to the beginning of the input subsequence; (b) input subsequence component of the surrogate signal used as input to the prediction algorithm: (c) output subsequence component is the remaining signal used as the known output of the prediction algorithm for validation. The prediction algorithm consists of three major steps: (1) extracting subsequences from training component which best-match the input subsequence according to given criterion; (2) calculating weighting factors from these best-matched subsequence; (3) collecting the proceeding parts of the subsequences and combining them together with assigned weighting factors to form output. The prediction algorithm was examined for several patients, and its performance is assessed based on the correlation between prediction and known output. Results: Respiratory motion data was collected for 20 patients using the RPM system. The output subsequence is the last 50 samples (∼2 seconds) of a surrogate signal, and the input subsequence was 100 (∼3 seconds) frames prior to the output subsequence. Based on the analysis of correlation coefficient between predicted and known output subsequence, the average correlation is 0.9644±0.0394 and 0.9789±0.0239 for equal-weighting and relative-weighting strategies, respectively. Conclusion: Preliminary results indicate that the prediction

  20. Hyperparameter Optimization of Artificial Neural Network in Customer Churn Prediction using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Martin Fridrich

    2017-06-01

    Full Text Available Purpose of the article: The ability of the company to predict customer churn and retain customers is considered to be worthy competitive advantage since it improves cost allocation in customer retention programs, retaining future revenue and profits. In addition, it has several positive indirect impacts such as increasing customer’s loyalty. Therefore, the focus of the article is on building highly reliable and robust classification model, which deals with such a task. Methodology/methods: The analysis is carried out on labelled ecommerce retail dataset describing 10 000 most valuable customers with the highest CLV (Customer Lifetime Value. To obtain the best performing ANN (Artificial Neural Network classification model, proposed hyperparameter search space is explored with genetic algorithm to find suitable parameter settings. ANN classification performance is measured with regard to prediction ability, which is understood as point estimate of AUC (Area Under Curve mean on 4fold cross-validation set. Explored part of hyperparameter search space is analyzed with conditional inference tree structure addressing underlying fundamental context of given optimization which results in identification of critical factors leading to well performing ANN classification model. Scientific aim: To present and execute experimental design for performance evaluation and hyperparameter optimization of classification models, which are used for customer churn prediction. Findings: It is concluded and statistically proven that in experimental context described, regularization parameter as well as training function have significant influence on classifiers AUC performance contrasting other properties of ANN. More specifically, well performing ANN classification models have regularization parameter set to 0, adaptation function set to trainlm or trainscg and more than 100 training epochs. Global optimum is identified for solution with regularization parameter set to

  1. Capacitance estimation algorithm based on DC-link voltage harmonics using artificial neural network in three-phase motor drive systems

    DEFF Research Database (Denmark)

    Soliman, Hammam Abdelaal Hammam; Davari, Pooya; Wang, Huai

    2017-01-01

    to industry. In this digest, a condition monitoring methodology that estimates the capacitance value of the dc-link capacitor in a three phase Front-End diode bridge motor drive is proposed. The proposed software methodology is based on Artificial Neural Network (ANN) algorithm. The harmonics of the dc......-link voltage are used as training data to the Artificial Neural Network. Fast Fourier Transform (FFT) of the dc-link voltage is analysed in order to study the impact of capacitance variation on the harmonics order. Laboratory experiments are conducted to validate the proposed methodology and the error analysis......In modern design of power electronic converters, reliability of dc-link capacitors is one of the critical considered aspects. The industrial field have been attracted to the monitoring of their health condition and the estimation of their ageing process status. However, the existing condition...

  2. Predicting the severity of nuclear power plant transients using nearest neighbors modeling optimized by genetic algorithms on a parallel computer

    International Nuclear Information System (INIS)

    Lin, J.; Bartal, Y.; Uhrig, R.E.

    1995-01-01

    The importance of automatic diagnostic systems for nuclear power plants (NPPs) has been discussed in numerous studies, and various such systems have been proposed. None of those systems were designed to predict the severity of the diagnosed scenario. A classification and severity prediction system for NPP transients is developed. The system is based on nearest neighbors modeling, which is optimized using genetic algorithms. The optimization process is used to determine the most important variables for each of the transient types analyzed. An enhanced version of the genetic algorithms is used in which a local downhill search is performed to further increase the accuracy achieved. The genetic algorithms search was implemented on a massively parallel supercomputer, the KSR1-64, to perform the analysis in a reasonable time. The data for this study were supplied by the high-fidelity simulator of the San Onofre unit 1 pressurized water reactor

  3. Machine Learning Algorithms For Predicting the Instability Timescales of Compact Planetary Systems

    Science.gov (United States)

    Tamayo, Daniel; Ali-Dib, Mohamad; Cloutier, Ryan; Huang, Chelsea; Van Laerhoven, Christa L.; Leblanc, Rejean; Menou, Kristen; Murray, Norman; Obertas, Alysa; Paradise, Adiv; Petrovich, Cristobal; Rachkov, Aleksandar; Rein, Hanno; Silburt, Ari; Tacik, Nick; Valencia, Diana

    2016-10-01

    The Kepler mission has uncovered hundreds of compact multi-planet systems. The dynamical pathways to instability in these compact systems and their associated timescales are not well understood theoretically. However, long-term stability is often used as a constraint to narrow down the space of orbital solutions from the transit data. This requires a large suite of N-body integrations that can each take several weeks to complete. This computational bottleneck is therefore an important limitation in our ability to characterize compact multi-planet systems.From suites of numerical simulations, previous studies have fit simple scaling relations between the instability timescale and various system parameters. However, the numerically simulated systems can deviate strongly from these empirical fits.We present a new approach to the problem using machine learning algorithms that have enjoyed success across a broad range of high-dimensional industry applications. In particular, we have generated large training sets of direct N-body integrations of synthetic compact planetary systems to train several regression models (support vector machine, gradient boost) that predict the instability timescale. We find that ensembling these models predicts the instability timescale of planetary systems better than previous approaches using the simple scaling relations mentioned above.Finally, we will discuss how these models provide a powerful tool for not only understanding the current Kepler multi-planet sample, but also for characterizing and shaping the radial-velocity follow-up strategies of multi-planet systems from the upcoming Transiting Exoplanet Survey Satellite (TESS) mission, given its shorter observation baselines.

  4. Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity

    Directory of Open Access Journals (Sweden)

    Xin Yi Ng

    2015-01-01

    Full Text Available This study concerns an attempt to establish a new method for predicting antimicrobial peptides (AMPs which are important to the immune system. Recently, researchers are interested in designing alternative drugs based on AMPs because they have found that a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMPs designing process as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMPs prediction is needed to resolve this problem. In this study, an integrated algorithm is newly introduced to predict AMPs by integrating sequence alignment and support vector machine- (SVM- LZ complexity pairwise algorithm. It was observed that, when all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in jackknife test and 87.59% in independent test, while the sensitivity obtained for jackknife test and independent test is 88.74% and 78.70%, respectively, when only the sequences that has less than 70% similarity are used. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity.

  5. PREDICTION BASED CHANNEL-HOPPING ALGORITHM FOR RENDEZVOUS IN COGNITIVE RADIO NETWORKS

    Directory of Open Access Journals (Sweden)

    Dhananjay Kumar

    2012-12-01

    Full Text Available Most common works for rendezvous in cognitive radio networks deal only with two user scenarios involving two secondary users and variable primary users and aim at reducing the time-to-rendezvous. A common control channel for the establishment of communication is not considered and hence the work comes under the category of ‘Blind Rendezvous’. Our work deal with multi-user scenario and provides a methodology for the users to find each other in the very first time slot spent for rendezvous or otherwise called the firstattempt- rendezvous. The secondary users make use of the history of past communications to enable them to predict the frequency channel that the user expects the rendezvous user to be. Our approach prevents greedy decision making between the users involved by the use of a cut-off time period for attempting rendezvous. Simulation results show that the time-to-rendezvous (TTR is greatly reduced upon comparison with other popular rendezvous algorithms.

  6. Mixing Energy Models in Genetic Algorithms for On-Lattice Protein Structure Prediction

    Directory of Open Access Journals (Sweden)

    Mahmood A. Rashid

    2013-01-01

    Full Text Available Protein structure prediction (PSP is computationally a very challenging problem. The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high resolution 20×20 energy model could better capture the behaviour of the actual energy function than a low resolution energy model such as hydrophobic polar. However, the fine grained details of the high resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high resolution energy model for protein structure evaluation but uses a low resolution HP energy model in focussing the search towards exploring structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significant lower energy structures compared to the state-of-the-art results.

  7. BP neural network optimized by genetic algorithm approach for titanium and iron content prediction in EDXRF

    International Nuclear Information System (INIS)

    Wang Jun; Liu Mingzhe; Li Zhe; Li Lei; Shi Rui; Tuo Xianguo

    2015-01-01

    The quantitative elemental content analysis is difficult due to the uniform effect, particle effect and the element matrix effect, etc, when using energy dispersive X-ray fluorescence (EDXRF) technique. In this paper, a hybrid approach of genetic algorithm (GA) and back propagation (BP) neural network was proposed without considering the complex relationship between the concentration and intensity. The aim of GA optimized BP was to get better network initial weights and thresholds. The basic idea was that the reciprocal of the mean square error of the initialization BP neural network was set as the fitness value of the individual in GA, and the initial weights and thresholds were replaced by individuals, and then the optimal individual was sought by selection, crossover and mutation operations, finally a new BP neural network model was created with the optimal initial weights and thresholds. The calculation results of quantitative analysis of titanium and iron contents for five types of ore bodies in Panzhihua Mine show that the results of classification prediction are far better than that of overall forecasting, and relative errors of 76.7% samples are less than 2% compared with chemical analysis values, which demonstrates the effectiveness of the proposed method. (authors)

  8. Prediction of Drug-Plasma Protein Binding Using Artificial Intelligence Based Algorithms.

    Science.gov (United States)

    Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

    2018-01-01

    Plasma protein binding (PPB) has vital importance in the characterization of drug distribution in the systemic circulation. Unfavorable PPB can pose a negative effect on clinical development of promising drug candidates. The drug distribution properties should be considered at the initial phases of the drug design and development. Therefore, PPB prediction models are receiving an increased attention. In the current study, we present a systematic approach using Support vector machine, Artificial neural network, k- nearest neighbor, Probabilistic neural network, Partial least square and Linear discriminant analysis to relate various in vitro and in silico molecular descriptors to a diverse dataset of 736 drugs/drug-like compounds. The overall accuracy of Support vector machine with Radial basis function kernel came out to be comparatively better than the rest of the applied algorithms. The training set accuracy, validation set accuracy, precision, sensitivity, specificity and F1 score for the Suprort vector machine was found to be 89.73%, 89.97%, 92.56%, 87.26%, 91.97% and 0.898, respectively. This model can potentially be useful in screening of relevant drug candidates at the preliminary stages of drug design and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  9. Proposed prediction algorithms based on hybrid approach to deal with anomalies of RFID data in healthcare

    Directory of Open Access Journals (Sweden)

    A. Anny Leema

    2013-07-01

    Full Text Available The RFID technology has penetrated the healthcare sector due to its increased functionality, low cost, high reliability, and easy-to-use capabilities. It is being deployed for various applications and the data captured by RFID readers increase according to timestamp resulting in an enormous volume of data duplication, false positive, and false negative. The dirty data stream generated by the RFID readers is one of the main factors limiting the widespread adoption of RFID technology. In order to provide reliable data to RFID application, it is necessary to clean the collected data and this should be done in an effective manner before they are subjected to warehousing. The existing approaches to deal with anomalies are physical, middleware, and deferred approach. The shortcomings of existing approaches are analyzed and found that robust RFID system can be built by integrating the middleware and deferred approach. Our proposed algorithms based on hybrid approach are tested in the healthcare environment which predicts false positive, false negative, and redundant data. In this paper, healthcare environment is simulated using RFID and the data observed by RFID reader consist of anomalies false positive, false negative, and duplication. Experimental evaluation shows that our cleansing methods remove errors in RFID data more accurately and efficiently. Thus, with the aid of the planned data cleaning technique, we can bring down the healthcare costs, optimize business processes, streamline patient identification processes, and improve patient safety.

  10. Genetic Algorithms for Estimating Effective Parameters in a Lumped Reactor Model for Reactivity Predictions

    International Nuclear Information System (INIS)

    Marseguerra, Marzio; Zio, Enrico

    2001-01-01

    The control system of a reactor should be able to predict, in real time, the amount of reactivity to be inserted (e.g., by control rod movements and boron injection and dilution) to respond to a given electrical load demand or to undesired, accidental transients. The real-time constraint renders impractical the use of a large, detailed dynamic reactor code. One has, then, to resort to simplified analytical models with lumped effective parameters suitably estimated from the reactor data.The simple and well-known Chernick model for describing the reactor power evolution in the presence of xenon is considered and the feasibility of using genetic algorithms for estimating the effective nuclear parameters involved and the initial nonmeasurable xenon and iodine conditions is investigated. This approach has the advantage of counterbalancing the inherent model simplicity with the periodic reestimation of the effective parameter values pertaining to each reactor on the basis of its recent history. By so doing, other effects, such as burnup, are automatically taken into account

  11. Cost and mortality impact of an algorithm-driven sepsis prediction system.

    Science.gov (United States)

    Calvert, Jacob; Hoffman, Jana; Barton, Christopher; Shimabukuro, David; Ries, Michael; Chettipally, Uli; Kerem, Yaniv; Jay, Melissa; Mataraso, Samson; Das, Ritankar

    2017-06-01

    To compute the financial and mortality impact of InSight, an algorithm-driven biomarker, which forecasts the onset of sepsis with minimal use of electronic health record data. This study compares InSight with existing sepsis screening tools and computes the differential life and cost savings associated with its use in the inpatient setting. To do so, mortality reduction is obtained from an increase in the number of sepsis cases correctly identified by InSight. Early sepsis detection by InSight is also associated with a reduction in length-of-stay, from which cost savings are directly computed. InSight identifies more true positive cases of severe sepsis, with fewer false alarms, than comparable methods. For an individual ICU with 50 beds, for example, it is determined that InSight annually saves 75 additional lives and reduces sepsis-related costs by $560,000. InSight performance results are derived from analysis of a single-center cohort. Mortality reduction results rely on a simplified use case, which fixes prediction times at 0, 1, and 2 h before sepsis onset, likely leading to under-estimates of lives saved. The corresponding cost reduction numbers are based on national averages for daily patient length-of-stay cost. InSight has the potential to reduce sepsis-related deaths and to lead to substantial cost savings for healthcare facilities.

  12. From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks.

    KAUST Repository

    Cannistraci, C.V.

    2013-04-08

    Growth and remodelling impact the network topology of complex systems, yet a general theory explaining how new links arise between existing nodes has been lacking, and little is known about the topological properties that facilitate link-prediction. Here we investigate the extent to which the connectivity evolution of a network might be predicted by mere topological features. We show how a link/community-based strategy triggers substantial prediction improvements because it accounts for the singular topology of several real networks organised in multiple local communities - a tendency here named local-community-paradigm (LCP). We observe that LCP networks are mainly formed by weak interactions and characterise heterogeneous and dynamic systems that use self-organisation as a major adaptation strategy. These systems seem designed for global delivery of information and processing via multiple local modules. Conversely, non-LCP networks have steady architectures formed by strong interactions, and seem designed for systems in which information/energy storage is crucial.

  13. Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

    Science.gov (United States)

    Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

    2018-01-01

    The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernel was investigated for the first time. Partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results with the coefficient of determination of prediction (R2P) of 0.901, and root mean square error of prediction (RMSEP) of 0.108 and residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is potential and feasible for determination of the protein content in peanut kernels.

  14. ANFIS Based Time Series Prediction Method of Bank Cash Flow Optimized by Adaptive Population Activity PSO Algorithm

    Directory of Open Access Journals (Sweden)

    Jie-Sheng Wang

    2015-06-01

    Full Text Available In order to improve the accuracy and real-time of all kinds of information in the cash business, and solve the problem which accuracy and stability is not high of the data linkage between cash inventory forecasting and cash management information in the commercial bank, a hybrid learning algorithm is proposed based on adaptive population activity particle swarm optimization (APAPSO algorithm combined with the least squares method (LMS to optimize the adaptive network-based fuzzy inference system (ANFIS model parameters. Through the introduction of metric function of population diversity to ensure the diversity of population and adaptive changes in inertia weight and learning factors, the optimization ability of the particle swarm optimization (PSO algorithm is improved, which avoids the premature convergence problem of the PSO algorithm. The simulation comparison experiments are carried out with BP-LMS algorithm and standard PSO-LMS by adopting real commercial banks’ cash flow data to verify the effectiveness of the proposed time series prediction of bank cash flow based on improved PSO-ANFIS optimization method. Simulation results show that the optimization speed is faster and the prediction accuracy is higher.

  15. Observational study to calculate addictive risk to opioids: a validation study of a predictive algorithm to evaluate opioid use disorder

    Directory of Open Access Journals (Sweden)

    Brenton A

    2017-05-01

    Full Text Available Ashley Brenton,1 Steven Richeimer,2,3 Maneesh Sharma,4 Chee Lee,1 Svetlana Kantorovich,1 John Blanchard,1 Brian Meshkin1 1Proove Biosciences, Irvine, CA, 2Keck school of Medicine, University of Southern California, Los Angeles, CA, 3Departments of Anesthesiology and Psychiatry, University of Southern California, Los Angeles, CA, 4Interventional Pain Institute, Baltimore, MD, USA Background: Opioid abuse in chronic pain patients is a major public health issue, with rapidly increasing addiction rates and deaths from unintentional overdose more than quadrupling since 1999. Purpose: This study seeks to determine the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm incorporating phenotypic risk factors and neuroscience-associated single-nucleotide polymorphisms (SNPs. Patients and methods: The Proove Opioid Risk (POR algorithm determines the predictability of aberrant behavior to opioids using a comprehensive scoring algorithm incorporating phenotypic risk factors and neuroscience-associated SNPs. In a validation study with 258 subjects with diagnosed opioid use disorder (OUD and 650 controls who reported using opioids, the POR successfully categorized patients at high and moderate risks of opioid misuse or abuse with 95.7% sensitivity. Regardless of changes in the prevalence of opioid misuse or abuse, the sensitivity of POR remained >95%. Conclusion: The POR correctly stratifies patients into low-, moderate-, and high-risk categories to appropriately identify patients at need for additional guidance, monitoring, or treatment changes. Keywords: opioid use disorder, addiction, personalized medicine, pharmacogenetics, genetic testing, predictive algorithm

  16. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

    Science.gov (United States)

    Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

    2017-07-01

    Variants in neuronal voltage-gated sodium channel α-subunits genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when missense variants are identified but heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand the results of indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetic Criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a

  17. Development and validation of a prediction algorithm for the onset of common mental disorders in a working population.

    Science.gov (United States)

    Fernandez, Ana; Salvador-Carulla, Luis; Choi, Isabella; Calvo, Rafael; Harvey, Samuel B; Glozier, Nicholas

    2018-01-01

    Common mental disorders are the most common reason for long-term sickness absence in most developed countries. Prediction algorithms for the onset of common mental disorders may help target indicated work-based prevention interventions. We aimed to develop and validate a risk algorithm to predict the onset of common mental disorders at 12 months in a working population. We conducted a secondary analysis of the Household, Income and Labour Dynamics in Australia Survey, a longitudinal, nationally representative household panel in Australia. Data from the 6189 working participants who did not meet the criteria for a common mental disorders at baseline were non-randomly split into training and validation databases, based on state of residence. Common mental disorders were assessed with the mental component score of 36-Item Short Form Health Survey questionnaire (score ⩽45). Risk algorithms were constructed following recommendations made by the Transparent Reporting of a multivariable prediction model for Prevention Or Diagnosis statement. Different risk factors were identified among women and men for the final risk algorithms. In the training data, the model for women had a C-index of 0.73 and effect size (Hedges' g) of 0.91. In men, the C-index was 0.76 and the effect size was 1.06. In the validation data, the C-index was 0.66 for women and 0.73 for men, with positive predictive values of 0.28 and 0.26, respectively Conclusion: It is possible to develop an algorithm with good discrimination for the onset identifying overall and modifiable risks of common mental disorders among working men. Such models have the potential to change the way that prevention of common mental disorders at the workplace is conducted, but different models may be required for women.

  18. NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction

    Directory of Open Access Journals (Sweden)

    Lund Ole

    2009-09-01

    Full Text Available Abstract Background The major histocompatibility complex (MHC molecule plays a central role in controlling the adaptive immune response to infections. MHC class I molecules present peptides derived from intracellular proteins to cytotoxic T cells, whereas MHC class II molecules stimulate cellular and humoral immunity through presentation of extracellularly derived peptides to helper T cells. Identification of which peptides will bind a given MHC molecule is thus of great importance for the understanding of host-pathogen interactions, and large efforts have been placed in developing algorithms capable of predicting this binding event. Results Here, we present a novel artificial neural network-based method, NN-align that allows for simultaneous identification of the MHC class II binding core and binding affinity. NN-align is trained using a novel training algorithm that allows for correction of bias in the training data due to redundant binding core representation. Incorporation of information about the residues flanking the peptide-binding core is shown to significantly improve the prediction accuracy. The method is evaluated on a large-scale benchmark consisting of six independent data sets covering 14 human MHC class II alleles, and is demonstrated to outperform other state-of-the-art MHC class II prediction methods. Conclusion The NN-align method is competitive with the state-of-the-art MHC class II peptide binding prediction algorithms. The method is publicly available at http://www.cbs.dtu.dk/services/NetMHCII-2.0.

  19. Accuracy assessment of pharmacogenetically predictive warfarin dosing algorithms in patients of an academic medical center anticoagulation clinic.

    Science.gov (United States)

    Shaw, Paul B; Donovan, Jennifer L; Tran, Maichi T; Lemon, Stephenie C; Burgwinkle, Pamela; Gore, Joel

    2010-08-01

    The objectives of this retrospective cohort study are to evaluate the accuracy of pharmacogenetic warfarin dosing algorithms in predicting therapeutic dose and to determine if this degree of accuracy warrants the routine use of genotyping to prospectively dose patients newly started on warfarin. Seventy-one patients of an outpatient anticoagulation clinic at an academic medical center who were age 18 years or older on a stable, therapeutic warfarin dose with international normalized ratio (INR) goal between 2.0 and 3.0, and cytochrome P450 isoenzyme 2C9 (CYP2C9) and vitamin K epoxide reductase complex subunit 1 (VKORC1) genotypes available between January 1, 2007 and September 30, 2008 were included. Six pharmacogenetic warfarin dosing algorithms were identified from the medical literature. Additionally, a 5 mg fixed dose approach was evaluated. Three algorithms, Zhu et al. (Clin Chem 53:1199-1205, 2007), Gage et al. (J Clin Ther 84:326-331, 2008), and International Warfarin Pharmacogenetic Consortium (IWPC) (N Engl J Med 360:753-764, 2009) were similar in the primary accuracy endpoints with mean absolute error (MAE) ranging from 1.7 to 1.8 mg/day and coefficient of determination R (2) from 0.61 to 0.66. However, the Zhu et al. algorithm severely over-predicted dose (defined as >or=2x or >or=2 mg/day more than actual dose) in twice as many (14 vs. 7%) patients as Gage et al. 2008 and IWPC 2009. In conclusion, the algorithms published by Gage et al. 2008 and the IWPC 2009 were the two most accurate pharmacogenetically based equations available in the medical literature in predicting therapeutic warfarin dose in our study population. However, the degree of accuracy demonstrated does not support the routine use of genotyping to prospectively dose all patients newly started on warfarin.

  20. Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

    Science.gov (United States)

    Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

    2016-01-01

    With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD makes the aggregative indicator of virtual machines in cloud, and become input parameters of predicted RBFNN. Also, this paper proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predictive periodical load value of nodes based on AHPPGD and RBFNN optimized by HHGA, then calculates the corresponding weight values of nodes and makes constant updates. Meanwhile, it keeps the advantages and avoids the shortcomings of static weighted round-robin algorithm.

  1. A Localization Method for Underwater Wireless Sensor Networks Based on Mobility Prediction and Particle Swarm Optimization Algorithms

    Directory of Open Access Journals (Sweden)

    Ying Zhang

    2016-02-01

    Full Text Available Due to their special environment, Underwater Wireless Sensor Networks (UWSNs are usually deployed over a large sea area and the nodes are usually floating. This results in a lower beacon node distribution density, a longer time for localization, and more energy consumption. Currently most of the localization algorithms in this field do not pay enough consideration on the mobility of the nodes. In this paper, by analyzing the mobility patterns of water near the seashore, a localization method for UWSNs based on a Mobility Prediction and a Particle Swarm Optimization algorithm (MP-PSO is proposed. In this method, the range-based PSO algorithm is used to locate the beacon nodes, and their velocities can be calculated. The velocity of an unknown node is calculated by using the spatial correlation of underwater object’s mobility, and then their locations can be predicted. The range-based PSO algorithm may cause considerable energy consumption and its computation complexity is a little bit high, nevertheless the number of beacon nodes is relatively smaller, so the calculation for the large number of unknown nodes is succinct, and this method can obviously decrease the energy consumption and time cost of localizing these mobile nodes. The simulation results indicate that this method has higher localization accuracy and better localization coverage rate compared with some other widely used localization methods in this field.

  2. A Localization Method for Underwater Wireless Sensor Networks Based on Mobility Prediction and Particle Swarm Optimization Algorithms.

    Science.gov (United States)

    Zhang, Ying; Liang, Jixing; Jiang, Shengming; Chen, Wei

    2016-02-06

    Due to their special environment, Underwater Wireless Sensor Networks (UWSNs) are usually deployed over a large sea area and the nodes are usually floating. This results in a lower beacon node distribution density, a longer time for localization, and more energy consumption. Currently most of the localization algorithms in this field do not pay enough consideration on the mobility of the nodes. In this paper, by analyzing the mobility patterns of water near the seashore, a localization method for UWSNs based on a Mobility Prediction and a Particle Swarm Optimization algorithm (MP-PSO) is proposed. In this method, the range-based PSO algorithm is used to locate the beacon nodes, and their velocities can be calculated. The velocity of an unknown node is calculated by using the spatial correlation of underwater object's mobility, and then their locations can be predicted. The range-based PSO algorithm may cause considerable energy consumption and its computation complexity is a little bit high, nevertheless the number of beacon nodes is relatively smaller, so the calculation for the large number of unknown nodes is succinct, and this method can obviously decrease the energy consumption and time cost of localizing these mobile nodes. The simulation results indicate that this method has higher localization accuracy and better localization coverage rate compared with some other widely used localization methods in this field.

  3. Optimization the Initial Weights of Artificial Neural Networks via Genetic Algorithm Applied to Hip Bone Fracture Prediction

    Directory of Open Access Journals (Sweden)

    Yu-Tzu Chang

    2012-01-01

    Full Text Available This paper aims to find the optimal set of initial weights to enhance the accuracy of artificial neural networks (ANNs by using genetic algorithms (GA. The sample in this study included 228 patients with first low-trauma hip fracture and 215 patients without hip fracture, both of them were interviewed with 78 questions. We used logistic regression to select 5 important factors (i.e., bone mineral density, experience of fracture, average hand grip strength, intake of coffee, and peak expiratory flow rate for building artificial neural networks to predict the probabilities of hip fractures. Three-layer (one hidden layer ANNs models with back-propagation training algorithms were adopted. The purpose in this paper is to find the optimal initial weights of neural networks via genetic algorithm to improve the predictability. Area under the ROC curve (AUC was used to assess the performance of neural networks. The study results showed the genetic algorithm obtained an AUC of 0.858±0.00493 on modeling data and 0.802 ± 0.03318 on testing data. They were slightly better than the results of our previous study (0.868±0.00387 and 0.796±0.02559, resp.. Thus, the preliminary study for only using simple GA has been proved to be effective for improving the accuracy of artificial neural networks.

  4. Improved algorithms and methods for room sound-field prediction by acoustical radiosity in arbitrary polyhedral rooms

    Science.gov (United States)

    Nosal, Eva-Marie; Hodgson, Murray; Ashdown, Ian

    2004-08-01

    This paper explores acoustical (or time-dependent) radiosity-a geometrical-acoustics sound-field prediction method that assumes diffuse surface reflection. The literature of acoustical radiosity is briefly reviewed and the advantages and disadvantages of the method are discussed. A discrete form of the integral equation that results from meshing the enclosure boundaries into patches is presented and used in a discrete-time algorithm. Furthermore, an averaging technique is used to reduce computational requirements. To generalize to nonrectangular rooms, a spherical-triangle method is proposed as a means of evaluating the integrals over solid angles that appear in the discrete form of the integral equation. The evaluation of form factors, which also appear in the numerical solution, is discussed for rectangular and nonrectangular rooms. This algorithm and associated methods are validated by comparison of the steady-state predictions for a spherical enclosure to analytical solutions.

  5. Prediction of Cancer Proteins by Integrating Protein Interaction, Domain Frequency, and Domain Interaction Data Using Machine Learning Algorithms

    Directory of Open Access Journals (Sweden)

    Chien-Hung Huang

    2015-01-01

    Full Text Available Many proteins are known to be associated with cancer diseases. It is quite often that their precise functional role in disease pathogenesis remains unclear. A strategy to gain a better understanding of the function of these proteins is to make use of a combination of different aspects of proteomics data types. In this study, we extended Aragues’s method by employing the protein-protein interaction (PPI data, domain-domain interaction (DDI data, weighted domain frequency score (DFS, and cancer linker degree (CLD data to predict cancer proteins. Performances were benchmarked based on three kinds of experiments as follows: (I using individual algorithm, (II combining algorithms, and (III combining the same classification types of algorithms. When compared with Aragues’s method, our proposed methods, that is, machine learning algorithm and voting with the majority, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm can achieve a hit ratio of 89.4% and 72.8% for lung cancer dataset and lung cancer microarray study, respectively. It is anticipated that the current research could help understand disease mechanisms and diagnosis.

  6. A rapid learning and dynamic stepwise updating algorithm for flat neural networks and the application to time-series prediction.

    Science.gov (United States)

    Chen, C P; Wan, J Z

    1999-01-01

    A fast learning algorithm is proposed to find an optimal weights of the flat neural networks (especially, the functional-link network). Although the flat networks are used for nonlinear function approximation, they can be formulated as linear systems. Thus, the weights of the networks can be solved easily using a linear least-square method. This formulation makes it easier to update the weights instantly for both a new added pattern and a new added enhancement node. A dynamic stepwise updating algorithm is proposed to update the weights of the system on-the-fly. The model is tested on several time-series data including an infrared laser data set, a chaotic time-series, a monthly flour price data set, and a nonlinear system identification problem. The simulation results are compared to existing models in which more complex architectures and more costly training are needed. The results indicate that the proposed model is very attractive to real-time processes.

  7. How Are Mate Preferences Linked with Actual Mate Selection? Tests of Mate Preference Integration Algorithms Using Computer Simulations and Actual Mating Couples.

    Science.gov (United States)

    Conroy-Beam, Daniel; Buss, David M

    2016-01-01

    Prior mate preference research has focused on the content of mate preferences. Yet in real life, people must select mates among potentials who vary along myriad dimensions. How do people incorporate information on many different mate preferences in order to choose which partner to pursue? Here, in Study 1, we compare seven candidate algorithms for integrating multiple mate preferences in a competitive agent-based model of human mate choice evolution. This model shows that a Euclidean algorithm is the most evolvable solution to the problem of selecting fitness-beneficial mates. Next, across three studies of actual couples (Study 2: n = 214; Study 3: n = 259; Study 4: n = 294) we apply the Euclidean algorithm toward predicting mate preference fulfillment overall and preference fulfillment as a function of mate value. Consistent with the hypothesis that mate preferences are integrated according to a Euclidean algorithm, we find that actual mates lie close in multidimensional preference space to the preferences of their partners. Moreover, this Euclidean preference fulfillment is greater for people who are higher in mate value, highlighting theoretically-predictable individual differences in who gets what they want. These new Euclidean tools have important implications for understanding real-world dynamics of mate selection.

  8. How Are Mate Preferences Linked with Actual Mate Selection? Tests of Mate Preference Integration Algorithms Using Computer Simulations and Actual Mating Couples.

    Directory of Open Access Journals (Sweden)

    Daniel Conroy-Beam

    Full Text Available Prior mate preference research has focused on the content of mate preferences. Yet in real life, people must select mates among potentials who vary along myriad dimensions. How do people incorporate information on many different mate preferences in order to choose which partner to pursue? Here, in Study 1, we compare seven candidate algorithms for integrating multiple mate preferences in a competitive agent-based model of human mate choice evolution. This model shows that a Euclidean algorithm is the most evolvable solution to the problem of selecting fitness-beneficial mates. Next, across three studies of actual couples (Study 2: n = 214; Study 3: n = 259; Study 4: n = 294 we apply the Euclidean algorithm toward predicting mate preference fulfillment overall and preference fulfillment as a function of mate value. Consistent with the hypothesis that mate preferences are integrated according to a Euclidean algorithm, we find that actual mates lie close in multidimensional preference space to the preferences of their partners. Moreover, this Euclidean preference fulfillment is greater for people who are higher in mate value, highlighting theoretically-predictable individual differences in who gets what they want. These new Euclidean tools have important implications for understanding real-world dynamics of mate selection.

  9. Genomic risk prediction of aromatase inhibitor-related arthralgia in patients with breast cancer using a novel machine-learning algorithm.

    Science.gov (United States)

    Reinbolt, Raquel E; Sonis, Stephen; Timmers, Cynthia D; Fernández-Martínez, Juan Luis; Cernea, Ana; de Andrés-Galiana, Enrique J; Hashemi, Sepehr; Miller, Karin; Pilarski, Robert; Lustberg, Maryam B

    2018-01-01

    Many breast cancer (BC) patients treated with aromatase inhibitors (AIs) develop aromatase inhibitor-related arthralgia (AIA). Candidate gene studies to identify AIA risk are limited in scope. We evaluated the potential of a novel analytic algorithm (NAA) to predict AIA using germline single nucleotide polymorphisms (SNP) data obtained before treatment initiation. Systematic chart review of 700 AI-treated patients with stage I-III BC identified asymptomatic patients (n = 39) and those with clinically significant AIA resulting in AI termination or therapy switch (n = 123). Germline DNA was obtained and SNP genotyping performed using the Affymetrix UK BioBank Axiom Array to yield 695,277 SNPs. SNP clusters that most closely defined AIA risk were discovered using an NAA that sequentially combined statistical filtering and a machine-learning algorithm. NCBI PhenGenI and Ensemble databases defined gene attribution of the most discriminating SNPs. Phenotype, pathway, and ontologic analyses assessed functional and mechanistic validity. Demographics were similar in cases and controls. A cluster of 70 SNPs, correlating to 57 genes, was identified. This SNP group predicted AIA occurrence with a maximum accuracy of 75.93%. Strong associations with arthralgia, breast cancer, and estrogen phenotypes were seen in 19/57 genes (33%) and were functionally consistent. Using a NAA, we identified a 70 SNP cluster that predicted AIA risk with fair accuracy. Phenotype, functional, and pathway analysis of attributed genes was consistent with clinical phenotypes. This study is the first to link a specific SNP/gene cluster to AIA risk independent of candidate gene bias. © 2017 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

  10. Prediction of composite fatigue life under variable amplitude loading using artificial neural network trained by genetic algorithm

    Science.gov (United States)

    Rohman, Muhamad Nur; Hidayat, Mas Irfan P.; Purniawan, Agung

    2018-04-01

    Neural networks (NN) have been widely used in application of fatigue life prediction. In the use of fatigue life prediction for polymeric-base composite, development of NN model is necessary with respect to the limited fatigue data and applicable to be used to predict the fatigue life under varying stress amplitudes in the different stress ratios. In the present paper, Multilayer-Perceptrons (MLP) model of neural network is developed, and Genetic Algorithm was employed to optimize the respective weights of NN for prediction of polymeric-base composite materials under variable amplitude loading. From the simulation result obtained with two different composite systems, named E-glass fabrics/epoxy (layups [(±45)/(0)2]S), and E-glass/polyester (layups [90/0/±45/0]S), NN model were trained with fatigue data from two different stress ratios, which represent limited fatigue data, can be used to predict another four and seven stress ratios respectively, with high accuracy of fatigue life prediction. The accuracy of NN prediction were quantified with the small value of mean square error (MSE). When using 33% from the total fatigue data for training, the NN model able to produce high accuracy for all stress ratios. When using less fatigue data during training (22% from the total fatigue data), the NN model still able to produce high coefficient of determination between the prediction result compared with obtained by experiment.

  11. Development and validation of a risk prediction algorithm for the recurrence of suicidal ideation among general population with low mood.

    Science.gov (United States)

    Liu, Y; Sareen, J; Bolton, J M; Wang, J L

    2016-03-15

    Suicidal ideation is one of the strongest predictors of recent and future suicide attempt. This study aimed to develop and validate a risk prediction algorithm for the recurrence of suicidal ideation among population with low mood 3035 participants from U.S National Epidemiologic Survey on Alcohol and Related Conditions with suicidal ideation at their lowest mood at baseline were included. The Alcohol Use Disorder and Associated Disabilities Interview Schedule, based on the DSM-IV criteria was used. Logistic regression modeling was conducted to derive the algorithm. Discrimination and calibration were assessed in the development and validation cohorts. In the development data, the proportion of recurrent suicidal ideation over 3 years was 19.5 (95% CI: 17.7, 21.5). The developed algorithm consisted of 6 predictors: age, feelings of emptiness, sudden mood changes, self-harm history, depressed mood in the past 4 weeks, interference with social activities in the past 4 weeks because of physical health or emotional problems and emptiness was the most important risk factor. The model had good discriminative power (C statistic=0.8273, 95% CI: 0.8027, 0.8520). The C statistic was 0.8091 (95% CI: 0.7786, 0.8395) in the external validation dataset and was 0.8193 (95% CI: 0.8001, 0.8385) in the combined dataset. This study does not apply to people with suicidal ideation who are not depressed. The developed risk algorithm for predicting the recurrence of suicidal ideation has good discrimination and excellent calibration. Clinicians can use this algorithm to stratify the risk of recurrence in patients and thus improve personalized treatment approaches, make advice and further intensive monitoring. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study.

    Science.gov (United States)

    Olivera, André Rodrigues; Roesler, Valter; Iochpe, Cirano; Schmidt, Maria Inês; Vigo, Álvaro; Barreto, Sandhi Maria; Duncan, Bruce Bartholow

    2017-01-01

    Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task. Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil. After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest. The best models were created using artificial neural networks and logistic regression. -These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step and 74.17% and 74.41% in the generalization testing step. Most of the predictive models produced similar results, and demonstrated the feasibility of identifying individuals with highest probability of having undiagnosed diabetes, through easily-obtained clinical data.

  13. Experimental 2.5-Gb/s QPSK WDM phase-modulated radio-over-fiber link with digital demodulation by a K-means algorithm

    DEFF Research Database (Denmark)

    Guerrero Gonzalez, Neil; Zibar, Darko; Caballero Jambrina, Antonio

    2010-01-01

    Highest reported bit rate of 2.5 Gb/s for optically phase modulated radio-over-fiber (RoF) link, employing digital coherent detection, is demonstrated. Demodulation of 3$,times,$ 2.5 Gb/s quadrature phase-shift keying modulated wavelength-division-multiplexed RoF channels is achieved after 79 km ...... of transmission through deployed fiber. Error-free performance (bit-error rate corresponding to $10^{{-}4}$) is achieved using a digital coherent receiver in combination with a $K$-means algorithm for radio-frequency phase recovery....

  14. Fast Algorithms for High-Order Sparse Linear Prediction with Applications to Speech Processing

    DEFF Research Database (Denmark)

    Jensen, Tobias Lindstrøm; Giacobello, Daniele; van Waterschoot, Toon

    2016-01-01

    In speech processing applications, imposing sparsity constraints on high-order linear prediction coefficients and prediction residuals has proven successful in overcoming some of the limitation of conventional linear predictive modeling. However, this modeling scheme, named sparse linear prediction...... problem with lower accuracy than in previous work. In the experimental analysis, we clearly show that a solution with lower accuracy can achieve approximately the same performance as a high accuracy solution both objectively, in terms of prediction gain, as well as with perceptual relevant measures, when...... evaluated in a speech reconstruction application....

  15. PREDICTIVE CONTROL OF A BATCH POLYMERIZATION SYSTEM USING A FEEDFORWARD NEURAL NETWORK WITH ONLINE ADAPTATION BY GENETIC ALGORITHM

    Directory of Open Access Journals (Sweden)

    A. Cancelier

    Full Text Available Abstract This study used a predictive controller based on an empirical nonlinear model comprising a three-layer feedforward neural network for temperature control of the suspension polymerization process. In addition to the offline training technique, an algorithm was also analyzed for online adaptation of its parameters. For the offline training, the network was statically trained and the genetic algorithm technique was used in combination with the least squares method. For online training, the network was trained on a recurring basis and only the technique of genetic algorithms was used. In this case, only the weights and bias of the output layer neuron were modified, starting from the parameters obtained from the offline training. From the experimental results obtained in a pilot plant, a good performance was observed for the proposed control system, with superior performance for the control algorithm with online adaptation of the model, particularly with respect to the presence of off-set for the case of the fixed parameters model.

  16. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on a three microarray data sets of leukemia, breast, and colon cancer. Supervised machine learning algorithms thus enable...

  17. Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced healthcare data

    Science.gov (United States)

    Wong, Raymond K.; Mohammed, Sabah; Fiaidhi, Jinan; Sung, Yunsick

    2017-01-01

    Clinical data analysis and forecasting have made substantial contributions to disease control, prevention and detection. However, such data usually suffer from highly imbalanced samples in class distributions. In this paper, we aim to formulate effective methods to rebalance binary imbalanced dataset, where the positive samples take up only the minority. We investigate two different meta-heuristic algorithms, particle swarm optimization and bat algorithm, and apply them to empower the effects of synthetic minority over-sampling technique (SMOTE) for pre-processing the datasets. One approach is to process the full dataset as a whole. The other is to split up the dataset and adaptively process it one segment at a time. The experimental results reported in this paper reveal that the performance improvements obtained by the former methods are not scalable to larger data scales. The latter methods, which we call Adaptive Swarm Balancing Algorithms, lead to significant efficiency and effectiveness improvements on large datasets while the first method is invalid. We also find it more consistent with the practice of the typical large imbalanced medical datasets. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE. The proposed methods lead to more credible performances of the classifier, and shortening the run time compared to brute-force method. PMID:28753613

  18. Step-by-Step Model for the Study of the Apriori Algorithm for Predictive Analysis

    Directory of Open Access Journals (Sweden)

    Daniel Grigore ROŞCA

    2015-06-01

    Full Text Available The goal of this paper was to develop an educational oriented application based on the Data Mining Apriori Algorithm which facilitates both the research and the study of data mining by graduate students. The application could be used to discover interesting patterns in the corpus of data and to measure the impact on the speed of execution as a function of problem constraints (value of support and confidence variables or size of the transactional data-base. The paper presents a brief overview of the Apriori Algorithm, aspects about the implementation of the algorithm using a step-by-step process, a discussion of the education-oriented user interface and the process of data mining of a test transactional data base. The impact of some constraints on the speed of the algorithm is also experimentally measured without a systematic review of different approaches to increase execution speed. Possible applications of the implementation, as well as its limits, are briefly reviewed.

  19. Development of a New Fractal Algorithm to Predict Quality Traits of MRI Loins

    DEFF Research Database (Denmark)

    Caballero, Daniel; Caro, Andrés; Amigo, José Manuel

    2017-01-01

    Traditionally, the quality traits of meat products have been estimated by means of physico-chemical methods. Computer vision algorithms on MRI have also been presented as an alternative to these destructive methods since MRI is non-destructive, non-ionizing and innocuous. The use of fractals...

  20. SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM

    International Nuclear Information System (INIS)

    Bobra, M. G.; Couvidat, S.

    2015-01-01

    We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities

  1. Prediction of breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm

    Science.gov (United States)

    Heidari, Morteza; Zargari Khuzani, Abolfazl; Hollingsworth, Alan B.; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qiu, Yuchen; Liu, Hong; Zheng, Bin

    2018-02-01

    In order to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study aims to investigate advantages of applying a machine learning approach embedded with a locally preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset involving negative mammograms acquired from 500 women was assembled. This dataset was divided into two age-matched classes of 250 high risk cases in which cancer was detected in the next subsequent mammography screening and 250 low risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment fibro-glandular tissue depicted on mammograms and initially compute 44 features related to the bilateral asymmetry of mammographic tissue density distribution between left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection in the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was applied to train and test the machine learning classifier embedded with a LLP algorithm, which generated a new operational vector with 4 features using a maximal variance approach in each LOCO process. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increased trend of adjusted odds ratios was also detected in which odds ratios increased from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality, and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.

  2. CorVue algorithm efficacy to predict heart failure in real life: Unnecessary and potentially misleading information?

    Science.gov (United States)

    Palfy, Julia Anna; Benezet-Mazuecos, Juan; Milla, Juan Martinez; Iglesias, Jose Antonio; de la Vieja, Juan Jose; Sanchez-Borque, Pepa; Miracle, Angel; Rubio, Jose Manuel

    2018-06-01

    Heart failure (HF) hospitalizations have a negative impact on quality of life and imply important costs. Intrathoracic impedance (ITI) variations detected by cardiac devices have been hypothesized to predict HF hospitalizations. Although Optivol™ algorithm (Medtronic) has been widely studied, CorVue™ algorithm (St. Jude Medical) long term efficacy has not been systematically evaluated in a "real life" cohort. CorVue™ was activated in ICD/CRT-D patients to store information about ITI measures. Clinical events (new episodes of HF requiring treatment and hospitalizations) and CorVue™ data were recorded every three months. Appropriate CorVue™ detection for HF was considered if it occurred in the four prior weeks to the clinical event. 53 ICD/CRT-D (26 ICD and 27 CRT-D) patients (67±1 years-old, 79% male) were included. Device position was subcutaneous in 28 patients. At inclusion, mean LVEF was 25±7% and 27 patients (51%) were in NYHA class I, 18 (34%) class II and 8 (15%) class III. After a mean follow-up of 17±9 months, 105 ITI drops alarms were detected in 32 patients (60%). Only six alarms were appropriate (true positive) and required hospitalization. Eighteen patients (34%) presented 25 clinical episodes (12 hospitalizations and 13 ER/ambulatory treatment modifications). Nineteen of these clinical episodes (76%) remained undetected by the CorVue™ (false negative). Sensitivity of CorVue™ resulted in 24%, specificity was 70%, positive predictive value of 6% and negative predictive value of 93%. CorVue™ showed a low sensitivity to predict HF events. Therefore, routinely activation of this algorithm could generate misleading information. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.

  3. GPS 2.1: enhanced prediction of kinase-specific phosphorylation sites with an algorithm of motif length selection.

    Science.gov (United States)

    Xue, Yu; Liu, Zexian; Cao, Jun; Ma, Qian; Gao, Xinjiao; Wang, Qingqi; Jin, Changjiang; Zhou, Yanhong; Wen, Longping; Ren, Jian

    2011-03-01

    As the most important post-translational modification of proteins, phosphorylation plays essential roles in all aspects of biological processes. Besides experimental approaches, computational prediction of phosphorylated proteins with their kinase-specific phosphorylation sites has also emerged as a popular strategy, for its low-cost, fast-speed and convenience. In this work, we developed a kinase-specific phosphorylation sites predictor of GPS 2.1 (Group-based Prediction System), with a novel but simple approach of motif length selection (MLS). By this approach, the robustness of the prediction system was greatly improved. All algorithms in GPS old versions were also reserved and integrated in GPS 2.1. The online service and local packages of GPS 2.1 were implemented in JAVA 1.5 (J2SE 5.0) and freely available for academic researches at: http://gps.biocuckoo.org.

  4. Optimization of a predictive controller of a pressurized water reactor Xenon oscillation using the particle swarm optimization algorithm

    International Nuclear Information System (INIS)

    Medeiros, Jose Antonio Carlos Canedo; Machado, Marcelo Dornellas; Lima, Alan Miranda M. de; Schirru, Roberto

    2007-01-01

    Predictive control systems are control systems that use a model of the controlled system (plant), used to predict the future behavior of the plant allowing the establishment of an anticipative control based on a future condition of the plant, and an optimizer that, considering a future time horizon of the plant output and a recent horizon of the control action, determines the controller's outputs to optimize a performance index of the controlled plant. The predictive control system does not require analytical models of the plant; the model of predictor of the plant can be learned from historical data of operation of the plant. The optimizer of the predictive controller establishes the strategy of the control: the minimization of a performance index (objective function) is done so that the present and future control actions are computed in such a way to minimize the objective function. The control strategy, implemented by the optimizer, induces the formation of an optimal control mechanism whose effect is to reduce the stabilization time, the 'overshoot' and 'undershoot', minimize the control actuation so that a compromise among those objectives is attained. The optimizer of the predictive controller is usually implemented using gradient-based algorithms. In this work we use the Particle Swarm Optimization algorithm (PSO) in the optimizer component of a predictive controller applied in the control of the xenon oscillation of a pressurized water reactor (PWR). The PSO is a stochastic optimization technique applied in several disciplines, simple and capable of providing a global optimal for high complexity problems and difficult to be optimized, providing in many cases better results than those obtained by other conventional and/or other artificial optimization techniques. (author)

  5. Prediction Method for Rain Rate and Rain Propagation Attenuation for K-Band Satellite Communications Links in Tropical Areas

    Directory of Open Access Journals (Sweden)

    Baso Maruddani

    2015-01-01

    Full Text Available This paper deals with the prediction method using hidden Markov model (HMM for rain rate and rain propagation attenuation for K-band satellite communication link at tropical area. As is well known, the K-band frequency is susceptible of being affected by atmospheric condition, especially in rainy condition. The wavelength of K-band frequency which approaches to the size of rain droplet causes the signal strength is easily attenuated and absorbed by the rain droplet. In order to keep the quality of system performance for K-band satellite communication link, therefore a special attention has to be paid for rain rate and rain propagation attenuation. Thus, a prediction method for rain rate and rain propagation attenuation based on HMM is developed to process the measurement data. The measured and predicted data are then compared with the ITU-R recommendation. From the result, it is shown that the measured and predicted data show similarity with the model of ITU-R P.837-5 recommendation for rain rate and the model of ITU-R P.618-10 recommendation for rain propagation attenuation. Meanwhile, statistical data for measured and predicted data such as fade duration and interfade duration have insignificant discrepancy with the model of ITU-R P.1623-1 recommendation.

  6. An empirical evaluation of classification algorithms for fault prediction in open source projects

    Directory of Open Access Journals (Sweden)

    Arvinder Kaur

    2018-01-01

    Full Text Available Creating software with high quality has become difficult these days with the fact that size and complexity of the developed software is high. Predicting the quality of software in early phases helps to reduce testing resources. Various statistical and machine learning techniques are used for prediction of the quality of the software. In this paper, six machine learning models have been used for software quality prediction on five open source software. Varieties of metrics have been evaluated for the software including C & K, Henderson & Sellers, McCabe etc. Results show that Random Forest and Bagging produce good results while Naïve Bayes is least preferable for prediction.

  7. A Building Model Framework for a Genetic Algorithm Multi-objective Model Predictive Control

    DEFF Research Database (Denmark)

    Arendt, Krzysztof; Ionesi, Ana; Jradi, Muhyiddine

    2016-01-01

    Model Predictive Control (MPC) of building systems is a promising approach to optimize building energy performance. In contrast to traditional control strategies which are reactive in nature, MPC optimizes the utilization of resources based on the predicted effects. It has been shown that energy ...

  8. Predictive factors for renal failure and a control and treatment algorithm

    Directory of Open Access Journals (Sweden)

    Denise de Paula Cerqueira

    2014-04-01

    Full Text Available OBJECTIVES: to evaluate the renal function of patients in an intensive care unit, to identify the predisposing factors for the development of renal failure, and to develop an algorithm to help in the control of the disease.METHOD: exploratory, descriptive, prospective study with a quantitative approach.RESULTS: a total of 30 patients (75.0% were diagnosed with kidney failure and the main factors associated with this disease were: advanced age, systemic arterial hypertension, diabetes mellitus, lung diseases, and antibiotic use. Of these, 23 patients (76.6% showed a reduction in creatinine clearance in the first 24 hours of hospitalization.CONCLUSION: a decline in renal function was observed in a significant number of subjects, therefore, an algorithm was developed with the aim of helping in the control of renal failure in a practical and functional way.

  9. Predicting Post-Translational Modifications from Local Sequence Fragments Using Machine Learning Algorithms: Overview and Best Practices.

    Science.gov (United States)

    Tatjewski, Marcin; Kierczak, Marcin; Plewczynski, Dariusz

    2017-01-01

    Here, we present two perspectives on the task of predicting post translational modifications (PTMs) from local sequence fragments using machine learning algorithms. The first is the description of the fundamental steps required to construct a PTM predictor from the very beginning. These steps include data gathering, feature extraction, or machine-learning classifier selection. The second part of our work contains the detailed discussion of more advanced problems which are encountered in PTM prediction task. Probably the most challenging issues which we have covered here are: (1) how to address the training data class imbalance problem (we also present statistics describing the problem); (2) how to properly set up cross-validation folds with an approach which takes into account the homology of protein data records, to address this problem we present our folds-over-clusters algorithm; and (3) how to efficiently reach for new sources of learning features. Presented techniques and notes resulted from intense studies in the field, performed by our and other groups, and can be useful both for researchers beginning in the field of PTM prediction and for those who want to extend the repertoire of their research techniques.

  10. The "weakest link" as an indicator of cognitive vulnerability differentially predicts symptom dimensions of anxiety in adolescents in China.

    Science.gov (United States)

    Wang, Junyi; Wang, Danyang; Cui, Lixia; McWhinnie, Chad M; Wang, Li; Xiao, Jing

    2017-08-01

    This multiwave longitudinal study examined the cognitive vulnerability-stress component of hopelessness theory to differentially predict symptom dimensions of anxiety using a "weakest link" approach in a sample of adolescents from Hunan Province, China. Baseline and 6-month follow-up data were obtained from 553 middle-school students. During an initial assessment, participants completed measures of assessing their weakest links, anxious symptoms, and the occurrence of stress. Participants subsequently completed measures assessing stress, and anxious symptoms one a month for six months. Higher weakest link scores were associated with greater increases in the harm avoidance and separation anxiety/panic dimensions, but not the physical or social anxiety dimension, of anxious symptoms following stress in Chinese adolescents. These results support the applicability of the "weakest link" approach, derived from hopelessness theory, in Chinese adolescents. Weakest link scores as cognitive vulnerability factors may play a role in the development of anxious symptoms, especially in the cognitive dimensions (e.g., harm avoidance and separation anxiety/panic). Our findings also have potential value in explaining the effectiveness of cognitive relevant therapy in treating the cognitive dimensions of anxious symptoms. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Intelligent Swarm Firefly Algorithm for the Prediction of China’s National Electricity Consumption

    OpenAIRE

    Zhang, Guangfeng; Chen, Yi; Yu, Yongnian; Wu, Shaomin

    2017-01-01

    China’s energy consumption is the world’s largest and is still rising, leading to concerns of energy shortage and environmental issues. It is, therefore, necessary to estimate the energy demand and to examine the dynamic nature of the electricity consumption. In this paper, we develop a nonlinear model of energy consumption and utilise a computational intelligence approach, specifcally a swarm frefly algorithm with a variable population, to examine China’s electricity consumption with historic...

  12. Predicting availability functions in time-dependent complex systems with SAEDES simulation algorithms

    International Nuclear Information System (INIS)

    Faulin, Javier; Juan, Angel A.; Serrat, Carles; Bargueno, Vicente

    2008-01-01

    In this paper, we propose the use of discrete-event simulation (DES) as an efficient methodology to obtain estimates of both survival and availability functions in time-dependent real systems-such as telecommunication networks or distributed computer systems. We discuss the use of DES in reliability and availability studies, not only as an alternative to the use of analytical and probabilistic methods, but also as a complementary way to: (i) achieve a better understanding of the system internal behavior and (ii) find out the relevance of each component under reliability/availability considerations. Specifically, this paper describes a general methodology and two DES algorithms, called SAEDES, which can be used to analyze a wide range of time-dependent complex systems, including those presenting multiple states, dependencies among failure/repair times or non-perfect maintenance policies. These algorithms can provide valuable information, specially during the design stages, where different scenarios can be compared in order to select a system design offering adequate reliability and availability levels. Two case studies are discussed, using a C/C++ implementation of the SAEDES algorithms, to show some potential applications of our approach

  13. Predicting availability functions in time-dependent complex systems with SAEDES simulation algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Faulin, Javier [Department of Statistics and Operations Research, Los Magnolios Building, First Floor, Campus Arrosadia, Public University of Navarre, 31006 Pamplona, Navarre (Spain)], E-mail: javier.faulin@unavarra.es; Juan, Angel A. [Department of Applied Mathematics I, Av. Doctor Maranon 44-50, Technical University of Catalonia, 08028 Barcelona (Spain)], E-mail: angel.alejandro.juan@upc.edu; Serrat, Carles [Department of Applied Mathematics I, Av. Doctor Maranon 44-50, Technical University of Catalonia, 08028 Barcelona (Spain)], E-mail: carles.serrat@upc.edu; Bargueno, Vicente [Department of Applied Mathematics I, ETS Ingenieros Industriales, Universidad Nacional de Educacion a Distancia, 28080 Madrid (Spain)], E-mail: vbargueno@ind.uned.es

    2008-11-15

    In this paper, we propose the use of discrete-event simulation (DES) as an efficient methodology to obtain estimates of both survival and availability functions in time-dependent real systems-such as telecommunication networks or distributed computer systems. We discuss the use of DES in reliability and availability studies, not only as an alternative to the use of analytical and probabilistic methods, but also as a complementary way to: (i) achieve a better understanding of the system internal behavior and (ii) find out the relevance of each component under reliability/availability considerations. Specifically, this paper describes a general methodology and two DES algorithms, called SAEDES, which can be used to analyze a wide range of time-dependent complex systems, including those presenting multiple states, dependencies among failure/repair times or non-perfect maintenance policies. These algorithms can provide valuable information, specially during the design stages, where different scenarios can be compared in order to select a system design offering adequate reliability and availability levels. Two case studies are discussed, using a C/C++ implementation of the SAEDES algorithms, to show some potential applications of our approach.

  14. An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms.

    Science.gov (United States)

    Hua, Hong-Li; Zhang, Fa-Zhan; Labena, Abraham Alemayehu; Dong, Chuan; Jin, Yan-Ting; Guo, Feng-Biao

    Investigation of essential genes is significant to comprehend the minimal gene sets of cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under receiver operating characteristic (ROC) curve (AUC) of 0.9716 through tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus , which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research presents that features obtained by homology mapping uniquely can achieve quite great or even better results than those integrated features. Meanwhile, the work indicates that machine learning-based method can assign more efficient weight coefficients than using empirical formula based on biological knowledge.

  15. Traffic Flow Prediction Using MI Algorithm and Considering Noisy and Data Loss Conditions: An Application to Minnesota Traffic Flow Prediction

    Directory of Open Access Journals (Sweden)

    Seyed Hadi Hosseini

    2014-10-01

    Full Text Available Traffic flow forecasting is useful for controlling traffic flow, traffic lights, and travel times. This study uses a multi-layer perceptron neural network and the mutual information (MI technique to forecast traffic flow and compares the prediction results with conventional traffic flow forecasting methods. The MI method is used to calculate the interdependency of historical traffic data and future traffic flow. In numerical case studies, the proposed traffic flow forecasting method was tested against data loss, changes in weather conditions, traffic congestion, and accidents. The outcomes were highly acceptable for all cases and showed the robustness of the proposed flow forecasting method.

  16. ProBiS tools (algorithm, database, and web servers) for predicting and modeling of biologically interesting proteins.

    Science.gov (United States)

    Konc, Janez; Janežič, Dušanka

    2017-09-01

    ProBiS (Protein Binding Sites) Tools consist of algorithm, database, and web servers for prediction of binding sites and protein ligands based on the detection of structurally similar binding sites in the Protein Data Bank. In this article, we review the operations that ProBiS Tools perform, provide comments on the evolution of the tools, and give some implementation details. We review some of its applications to biologically interesting proteins. ProBiS Tools are freely available at http://probis.cmm.ki.si and http://probis.nih.gov. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Linking linear programming and spatial simulation models to predict landscape effects of forest management alternatives

    Science.gov (United States)

    Eric J. Gustafson; L. Jay Roberts; Larry A. Leefers

    2006-01-01

    Forest management planners require analytical tools to assess the effects of alternative strategies on the sometimes disparate benefits from forests such as timber production and wildlife habitat. We assessed the spatial patterns of alternative management strategies by linking two models that were developed for different purposes. We used a linear programming model (...

  18. Hybrid robust model based on an improved functional link neural network integrating with partial least square (IFLNN-PLS) and its application to predicting key process variables.

    Science.gov (United States)

    He, Yan-Lin; Xu, Yuan; Geng, Zhi-Qiang; Zhu, Qun-Xiong

    2016-03-01

    In this paper, a hybrid robust model based on an improved functional link neural network integrating with partial least square (IFLNN-PLS) is proposed. Firstly, an improved functional link neural network with small norm of expanded weights and high input-output correlation (SNEWHIOC-FLNN) was proposed for enhancing the generalization performance of FLNN. Unlike the traditional FLNN, the expanded variables of the original inputs are not directly used as the inputs in the proposed SNEWHIOC-FLNN model. The original inputs are attached to some small norm of expanded weights. As a result, the correlation coefficient between some of the expanded variables and the outputs is enhanced. The larger the correlation coefficient is, the more relevant the expanded variables tend to be. In the end, the expanded variables with larger correlation coefficient are selected as the inputs to improve the performance of the traditional FLNN. In order to test the proposed SNEWHIOC-FLNN model, three UCI (University of California, Irvine) regression datasets named Housing, Concrete Compressive Strength (CCS), and Yacht Hydro Dynamics (YHD) are selected. Then a hybrid model based on the improved FLNN integrating with partial least square (IFLNN-PLS) was built. In IFLNN-PLS model, the connection weights are calculated using the partial least square method but not the error back propagation algorithm. Lastly, IFLNN-PLS was developed as an intelligent measurement model for accurately predicting the key variables in the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. Simulation results illustrated that the IFLNN-PLS could significant improve the prediction performance. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.

  19. An investigation of classification algorithms for predicting HIV drug resistance without genotype resistance testing

    CSIR Research Space (South Africa)

    Brandt, P

    2014-01-01

    Full Text Available is limited in low-resource settings. In this paper we investigate machine learning techniques for drug resistance prediction from routine treatment and laboratory data to help clinicians select patients for confirmatory genotype testing. The techniques...

  20. The comparison of predictive scheduling algorithms for different sizes of job shop scheduling problems

    Science.gov (United States)

    Paprocka, I.; Kempa, W. M.; Grabowik, C.; Kalinowski, K.; Krenczyk, D.

    2016-08-01

    In the paper a survey of predictive and reactive scheduling methods is done in order to evaluate how the ability of prediction of reliability characteristics influences over robustness criteria. The most important reliability characteristics are: Mean Time to Failure, Mean Time of Repair. Survey analysis is done for a job shop scheduling problem. The paper answers the question: what method generates robust schedules in the case of a bottleneck failure occurrence before, at the beginning of planned maintenance actions or after planned maintenance actions? Efficiency of predictive schedules is evaluated using criteria: makespan, total tardiness, flow time, idle time. Efficiency of reactive schedules is evaluated using: solution robustness criterion and quality robustness criterion. This paper is the continuation of the research conducted in the paper [1], where the survey of predictive and reactive scheduling methods is done only for small size scheduling problems.

  1. Global Warming: Predicting OPEC Carbon Dioxide Emissions from Petroleum Consumption Using Neural Network and Hybrid Cuckoo Search Algorithm.

    Science.gov (United States)

    Chiroma, Haruna; Abdul-kareem, Sameem; Khan, Abdullah; Nawi, Nazri Mohd; Gital, Abdulsalam Ya'u; Shuib, Liyana; Abubakar, Adamu I; Rahman, Muhammad Zubair; Herawan, Tutut

    2015-01-01

    Global warming is attracting attention from policy makers due to its impacts such as floods, extreme weather, increases in temperature by 0.7°C, heat waves, storms, etc. These disasters result in loss of human life and billions of dollars in property. Global warming is believed to be caused by the emissions of greenhouse gases due to human activities including the emissions of carbon dioxide (CO2) from petroleum consumption. Limitations of the previous methods of predicting CO2 emissions and lack of work on the prediction of the Organization of the Petroleum Exporting Countries (OPEC) CO2 emissions from petroleum consumption have motivated this research. The OPEC CO2 emissions data were collected from the Energy Information Administration. Artificial Neural Network (ANN) adaptability and performance motivated its choice for this study. To improve effectiveness of the ANN, the cuckoo search algorithm was hybridised with accelerated particle swarm optimisation for training the ANN to build a model for the prediction of OPEC CO2 emissions. The proposed model predicts OPEC CO2 emissions for 3, 6, 9, 12 and 16 years with an improved accuracy and speed over the state-of-the-art methods. An accurate prediction of OPEC CO2 emissions can serve as a reference point for propagating the reorganisation of economic development in OPEC member countries with the view of reducing CO2 emissions to Kyoto benchmarks--hence, reducing global warming. The policy implications are discussed in the paper.

  2. Global Warming: Predicting OPEC Carbon Dioxide Emissions from Petroleum Consumption Using Neural Network and Hybrid Cuckoo Search Algorithm.

    Directory of Open Access Journals (Sweden)

    Haruna Chiroma

    Full Text Available Global warming is attracting attention from policy makers due to its impacts such as floods, extreme weather, increases in temperature by 0.7°C, heat waves, storms, etc. These disasters result in loss of human life and billions of dollars in property. Global warming is believed to be caused by the emissions of greenhouse gases due to human activities including the emissions of carbon dioxide (CO2 from petroleum consumption. Limitations of the previous methods of predicting CO2 emissions and lack of work on the prediction of the Organization of the Petroleum Exporting Countries (OPEC CO2 emissions from petroleum consumption have motivated this research.The OPEC CO2 emissions data were collected from the Energy Information Administration. Artificial Neural Network (ANN adaptability and performance motivated its choice for this study. To improve effectiveness of the ANN, the cuckoo search algorithm was hybridised with accelerated particle swarm optimisation for training the ANN to build a model for the prediction of OPEC CO2 emissions. The proposed model predicts OPEC CO2 emissions for 3, 6, 9, 12 and 16 years with an improved accuracy and speed over the state-of-the-art methods.An accurate prediction of OPEC CO2 emissions can serve as a reference point for propagating the reorganisation of economic development in OPEC member countries with the view of reducing CO2 emissions to Kyoto benchmarks--hence, reducing global warming. The policy implications are discussed in the paper.

  3. EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm.

    Science.gov (United States)

    Kim, Seong Gon; Harwani, Mrudul; Grama, Ananth; Chaterji, Somali

    2016-12-08

    We present EP-DNN, a protocol for predicting enhancers based on chromatin features, in different cell types. Specifically, we use a deep neural network (DNN)-based architecture to extract enhancer signatures in a representative human embryonic stem cell type (H1) and a differentiated lung cell type (IMR90). We train EP-DNN using p300 binding sites, as enhancers, and TSS and random non-DHS sites, as non-enhancers. We perform same-cell and cross-cell predictions to quantify the validation rate and compare against two state-of-the-art methods, DEEP-ENCODE and RFECS. We find that EP-DNN has superior accuracy with a validation rate of 91.6%, relative to 85.3% for DEEP-ENCODE and 85.5% for RFECS, for a given number of enhancer predictions and also scales better for a larger number of enhancer predictions. Moreover, our H1 → IMR90 predictions turn out to be more accurate than IMR90 → IMR90, potentially because H1 exhibits a richer signature set and our EP-DNN model is expressive enough to extract these subtleties. Our work shows how to leverage the full expressivity of deep learning models, using multiple hidden layers, while avoiding overfitting on the training data. We also lay the foundation for exploration of cross-cell enhancer predictions, potentially reducing the need for expensive experimentation.

  4. EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm

    Science.gov (United States)

    Kim, Seong Gon; Harwani, Mrudul; Grama, Ananth; Chaterji, Somali

    2016-12-01

    We present EP-DNN, a protocol for predicting enhancers based on chromatin features, in different cell types. Specifically, we use a deep neural network (DNN)-based architecture to extract enhancer signatures in a representative human embryonic stem cell type (H1) and a differentiated lung cell type (IMR90). We train EP-DNN using p300 binding sites, as enhancers, and TSS and random non-DHS sites, as non-enhancers. We perform same-cell and cross-cell predictions to quantify the validation rate and compare against two state-of-the-art methods, DEEP-ENCODE and RFECS. We find that EP-DNN has superior accuracy with a validation rate of 91.6%, relative to 85.3% for DEEP-ENCODE and 85.5% for RFECS, for a given number of enhancer predictions and also scales better for a larger number of enhancer predictions. Moreover, our H1 → IMR90 predictions turn out to be more accurate than IMR90 → IMR90, potentially because H1 exhibits a richer signature set and our EP-DNN model is expressive enough to extract these subtleties. Our work shows how to leverage the full expressivity of deep learning models, using multiple hidden layers, while avoiding overfitting on the training data. We also lay the foundation for exploration of cross-cell enhancer predictions, potentially reducing the need for expensive experimentation.

  5. Exploitation of complex network topology for link prediction in biological interactomes

    KAUST Repository

    Alanis Lobato, Gregorio

    2014-01-01

    In this work, we propose three novel and powerful approaches for the prediction of interactions in biological networks and conclude that it is possible to mine the topology of these complex system representations and produce reliable

  6. Simplification of neural network model for predicting local power distributions of BWR fuel bundle using learning algorithm with forgetting

    International Nuclear Information System (INIS)

    Tanabe, Akira; Yamamoto, Toru; Shinfuku, Kimihiro; Nakamae, Takuji; Nishide, Fusayo.

    1995-01-01

    Previously a two-layered neural network model was developed to predict the relation between fissile enrichment of each fuel rod and local power distribution in a BWR fuel bundle. This model was obtained intuitively based on 33 patterns of training signals after an intensive survey of the models. Recently, a learning algorithm with forgetting was reported to simplify neural network models. It is an interesting subject what kind of model will be obtained if this algorithm is applied to the complex three-layered model which learns the same training signals. A three-layered model which is expanded to have direct connections between the 1st and the 3rd layer elements has been constructed and the learning method of normal back propagation was applied first to this model. The forgetting algorithm was then added to this learning process. The connections concerned with the 2nd layer elements disappeared and the 2nd layer has become unnecessary. It took a longer computing time by an order to learn the same training signals than the simple back propagation, but the two-layered model was obtained autonomously from the expanded three-layered model. (author)

  7. Ant colony optimization algorithm for interpretable Bayesian classifiers combination: application to medical predictions.

    Directory of Open Access Journals (Sweden)

    Salah Bouktif

    Full Text Available Prediction and classification techniques have been well studied by machine learning researchers and developed for several real-word problems. However, the level of acceptance and success of prediction models are still below expectation due to some difficulties such as the low performance of prediction models when they are applied in different environments. Such a problem has been addressed by many researchers, mainly from the machine learning community. A second problem, principally raised by model users in different communities, such as managers, economists, engineers, biologists, and medical practitioners, etc., is the prediction models' interpretability. The latter is the ability of a model to explain its predictions and exhibit the causality relationships between the inputs and the outputs. In the case of classification, a successful way to alleviate the low performance is to use ensemble classiers. It is an intuitive strategy to activate collaboration between different classifiers towards a better performance than individual classier. Unfortunately, ensemble classifiers method do not take into account the interpretability of the final classification outcome. It even worsens the original interpretability of the individual classifiers. In this paper we propose a novel implementation of classifiers combination approach that does not only promote the overall performance but also preserves the interpretability of the resulting model. We propose a solution based on Ant Colony Optimization and tailored for the case of Bayesian classifiers. We validate our proposed solution with case studies from medical domain namely, heart disease and Cardiotography-based predictions, problems where interpretability is critical to make appropriate clinical decisions.The datasets, Prediction Models and software tool together with supplementary materials are available at http://faculty.uaeu.ac.ae/salahb/ACO4BC.htm.

  8. Using Tree Detection Algorithms to Predict Stand Sapwood Area, Basal Area and Stocking Density in Eucalyptus regnans Forest

    Directory of Open Access Journals (Sweden)

    Dominik Jaskierniak

    2015-06-01

    Full Text Available Managers of forested water supply catchments require efficient and accurate methods to quantify changes in forest water use due to changes in forest structure and density after disturbance. Using Light Detection and Ranging (LiDAR data with as few as 0.9 pulses m−2, we applied a local maximum filtering (LMF method and normalised cut (NCut algorithm to predict stocking density (SDen of a 69-year-old Eucalyptus regnans forest comprising 251 plots with resolution of the order of 0.04 ha. Using the NCut method we predicted basal area (BAHa per hectare and sapwood area (SAHa per hectare, a well-established proxy for transpiration. Sapwood area was also indirectly estimated with allometric relationships dependent on LiDAR derived SDen and BAHa using a computationally efficient procedure. The individual tree detection (ITD rates for the LMF and NCut methods respectively had 72% and 68% of stems correctly identified, 25% and 20% of stems missed, and 2% and 12% of stems over-segmented. The significantly higher computational requirement of the NCut algorithm makes the LMF method more suitable for predicting SDen across large forested areas. Using NCut derived ITD segments, observed versus predicted stand BAHa had R2 ranging from 0.70 to 0.98 across six catchments, whereas a generalised parsimonious model applied to all sites used the portion of hits greater than 37 m in height (PH37 to explain 68% of BAHa. For extrapolating one ha resolution SAHa estimates across large forested catchments, we found that directly relating SAHa to NCut derived LiDAR indices (R2 = 0.56 was slightly more accurate but computationally more demanding than indirect estimates of SAHa using allometric relationships consisting of BAHa (R2 = 0.50 or a sapwood perimeter index, defined as (BAHaSDen½ (R2 = 0.48.

  9. Feature selection for disruption prediction from scratch in JET by using genetic algorithms and probabilistic predictors

    Energy Technology Data Exchange (ETDEWEB)

    Pereira, Augusto, E-mail: augusto.pereira@ciemat.es [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Vega, Jesús; Moreno, Raúl [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Dormido-Canto, Sebastián [Dpto. Informática y Automática – UNED, Madrid (Spain); Rattá, Giuseppe A. [Laboratorio Nacional de Fusión, CIEMAT, Madrid (Spain); Pavón, Fernando [Dpto. Informática y Automática – UNED, Madrid (Spain)

    2015-10-15

    Recently, a probabilistic classifier has been developed at JET to be used as predictor from scratch. It has been applied to a database of 1237 JET ITER-like wall (ILW) discharges (of which 201 disrupted) with good results: success rate of 94% and false alarm rate of 4.21%. A combinatorial analysis between 14 features to ensure the selection of the best ones to achieve good enough results in terms of success rate and false alarm rate was performed. All possible combinations with a number of features between 2 and 7 were tested and 9893 different predictors were analyzed. An important drawback in this analysis was the time required to compute the results that can be estimated in 1731 h (∼2.4 months). Genetic algorithms (GA) are searching algorithms that simulate the process of natural selection. In this article, the GA and the Venn predictors are combined with the objective not only of finding good enough features within the 14 available ones but also of reducing the computational time requirements. Five different performance metrics as measures of the GA fitness function have been evaluated. The best metric was the measurement called Informedness, with just 6 generations (168 predictors at 29.4 h).

  10. Water quality of Danube Delta systems: ecological status and prediction using machine-learning algorithms.

    Science.gov (United States)

    Stoica, C; Camejo, J; Banciu, A; Nita-Lazar, M; Paun, I; Cristofor, S; Pacheco, O R; Guevara, M

    2016-01-01

    Environmental issues have a worldwide impact on water bodies, including the Danube Delta, the largest European wetland. The Water Framework Directive (2000/60/EC) implementation operates toward solving environmental issues from European and national level. As a consequence, the water quality and the biocenosis structure was altered, especially the composition of the macro invertebrate community which is closely related to habitat and substrate heterogeneity. This study aims to assess the ecological status of Southern Branch of the Danube Delta, Saint Gheorghe, using benthic fauna and a computational method as an alternative for monitoring the water quality in real time. The analysis of spatial and temporal variability of unicriterial and multicriterial indices were used to assess the current status of aquatic systems. In addition, chemical status was characterized. Coliform bacteria and several chemical parameters were used to feed machine-learning (ML) algorithms to simulate a real-time classification method. Overall, the assessment of the water bodies indicated a moderate ecological status based on the biological quality elements or a good ecological status based on chemical and ML algorithms criteria.

  11. Feature selection for disruption prediction from scratch in JET by using genetic algorithms and probabilistic predictors

    International Nuclear Information System (INIS)

    Pereira, Augusto; Vega, Jesús; Moreno, Raúl; Dormido-Canto, Sebastián; Rattá, Giuseppe A.; Pavón, Fernando

    2015-01-01

    Recently, a probabilistic classifier has been developed at JET to be used as predictor from scratch. It has been applied to a database of 1237 JET ITER-like wall (ILW) discharges (of which 201 disrupted) with good results: success rate of 94% and false alarm rate of 4.21%. A combinatorial analysis between 14 features to ensure the selection of the best ones to achieve good enough results in terms of success rate and false alarm rate was performed. All possible combinations with a number of features between 2 and 7 were tested and 9893 different predictors were analyzed. An important drawback in this analysis was the time required to compute the results that can be estimated in 1731 h (∼2.4 months). Genetic algorithms (GA) are searching algorithms that simulate the process of natural selection. In this article, the GA and the Venn predictors are combined with the objective not only of finding good enough features within the 14 available ones but also of reducing the computational time requirements. Five different performance metrics as measures of the GA fitness function have been evaluated. The best metric was the measurement called Informedness, with just 6 generations (168 predictors at 29.4 h).

  12. Development of a novel diagnostic algorithm to predict NASH in HCV-positive patients.

    Science.gov (United States)

    Gallotta, Andrea; Paneghetti, Laura; Mrázová, Viera; Bednárová, Adriana; Kružlicová, Dáša; Frecer, Vladimir; Miertus, Stanislav; Biasiolo, Alessandra; Martini, Andrea; Pontisso, Patrizia; Fassina, Giorgio

    2018-05-01

    Non-alcoholic steato-hepatitis (NASH) is a severe disease characterised by liver inflammation and progressive hepatic fibrosis, which may progress to cirrhosis and hepatocellular carcinoma. Clinical evidence suggests that in hepatitis C virus patients steatosis and NASH are associated with faster fibrosis progression and hepatocellular carcinoma. A safe and reliable non-invasive diagnostic method to detect NASH at its early stages is still needed to prevent progression of the disease. We prospectively enrolled 91 hepatitis C virus-positive patients with histologically proven chronic liver disease: 77 patients were included in our study; of these, 10 had NASH. For each patient, various clinical and serological variables were collected. Different algorithms combining squamous cell carcinoma antigen-immunoglobulin-M (SCCA-IgM) levels with other common clinical data were created to provide the probability of having NASH. Our analysis revealed a statistically significant correlation between the histological presence of NASH and SCCA-IgM, insulin, homeostasis model assessment, haemoglobin, high-density lipoprotein and ferritin levels, and smoke. Compared to the use of a single marker, algorithms that combined four, six or seven variables identified NASH with higher accuracy. The best diagnostic performance was obtained with the logistic regression combination, which included all seven variables correlated with NASH. The combination of SCCA-IgM with common clinical data shows promising diagnostic performance for the detection of NASH in hepatitis C virus patients.

  13. A finite element-based algorithm for rubbing induced vibration prediction in rotors

    Science.gov (United States)

    Behzad, Mehdi; Alvandi, Mehdi; Mba, David; Jamali, Jalil

    2013-10-01

    In this paper, an algorithm is developed for more realistic investigation of rotor-to-stator rubbing vibration, based on finite element theory with unilateral contact and friction conditions. To model the rotor, cross sections are assumed to be radially rigid. A finite element discretization based on traditional beam theories which sufficiently accounts for axial and transversal flexibility of the rotor is used. A general finite element discretization model considering inertial and viscoelastic characteristics of the stator is used for modeling the stator. Therefore, for contact analysis, only the boundary of the stator is discretized. The contact problem is defined as the contact between the circular rigid cross section of the rotor and “nodes” of the stator only. Next, Gap function and contact conditions are described for the contact problem. Two finite element models of the rotor and the stator are coupled via the Lagrange multipliers method in order to obtain the constrained equation of motion. A case study of the partial rubbing is simulated using the algorithm. The synchronous and subsynchronous responses of the partial rubbing are obtained for different rotational speeds. In addition, a sensitivity analysis is carried out with respect to the initial clearance, the stator stiffness, the damping parameter, and the coefficient of friction. There is a good agreement between the result of this research and the experimental result in the literature.

  14. Integrated Detection and Prediction of Influenza Activity for Real-Time Surveillance: Algorithm Design.

    Science.gov (United States)

    Spreco, Armin; Eriksson, Olle; Dahlström, Örjan; Cowling, Benjamin John; Timpka, Toomas

    2017-06-15

    Influenza is a viral respiratory disease capable of causing epidemics that represent a threat to communities worldwide. The rapidly growing availability of electronic "big data" from diagnostic and prediagnostic sources in health care and public health settings permits advance of a new generation of methods for local detection and prediction of winter influenza seasons and influenza pandemics. The aim of this study was to present a method for integrated detection and prediction of influenza virus activity in local settings using electronically available surveillance data and to evaluate its performance by retrospective application on authentic data from a Swedish county. An integrated detection and prediction method was formally defined based on a design rationale for influenza detection and prediction methods adapted for local surveillance. The novel method was retrospectively applied on data from the winter influenza season 2008-09 in a Swedish county (population 445,000). Outcome data represented individuals who met a clinical case definition for influenza (based on International Classification of Diseases version 10 [ICD-10] codes) from an electronic health data repository. Information from calls to a telenursing service in the county was used as syndromic data source. The novel integrated detection and prediction method is based on nonmechanistic statistical models and is designed for integration in local health information systems. The method is divided into separate modules for detection and prediction of local influenza virus activity. The function of the detection module is to alert for an upcoming period of increased load of influenza cases on local health care (using influenza-diagnosis data), whereas the function of the prediction module is to predict the timing of the activity peak (using syndromic data) and its intensity (using influenza-diagnosis data). For detection modeling, exponential regression was used based on the assumption that the beginning

  15. A Dual-Channel Acquisition Method Based on Extended Replica Folding Algorithm for Long Pseudo-Noise Code in Inter-Satellite Links.

    Science.gov (United States)

    Zhao, Hongbo; Chen, Yuying; Feng, Wenquan; Zhuang, Chen

    2018-05-25

    Inter-satellite links are an important component of the new generation of satellite navigation systems, characterized by low signal-to-noise ratio (SNR), complex electromagnetic interference and the short time slot of each satellite, which brings difficulties to the acquisition stage. The inter-satellite link in both Global Positioning System (GPS) and BeiDou Navigation Satellite System (BDS) adopt the long code spread spectrum system. However, long code acquisition is a difficult and time-consuming task due to the long code period. Traditional folding methods such as extended replica folding acquisition search technique (XFAST) and direct average are largely restricted because of code Doppler and additional SNR loss caused by replica folding. The dual folding method (DF-XFAST) and dual-channel method have been proposed to achieve long code acquisition in low SNR and high dynamic situations, respectively, but the former is easily affected by code Doppler and the latter is not fast enough. Considering the environment of inter-satellite links and the problems of existing algorithms, this paper proposes a new long code acquisition algorithm named dual-channel acquisition method based on the extended replica folding algorithm (DC-XFAST). This method employs dual channels for verification. Each channel contains an incoming signal block. Local code samples are folded and zero-padded to the length of the incoming signal block. After a circular FFT operation, the correlation results contain two peaks of the same magnitude and specified relative position. The detection process is eased through finding the two largest values. The verification takes all the full and partial peaks into account. Numerical results reveal that the DC-XFAST method can improve acquisition performance while acquisition speed is guaranteed. The method has a significantly higher acquisition probability than folding methods XFAST and DF-XFAST. Moreover, with the advantage of higher detection

  16. A Dual-Channel Acquisition Method Based on Extended Replica Folding Algorithm for Long Pseudo-Noise Code in Inter-Satellite Links

    Directory of Open Access Journals (Sweden)

    Hongbo Zhao

    2018-05-01

    Full Text Available Inter-satellite links are an important component of the new generation of satellite navigation systems, characterized by low signal-to-noise ratio (SNR, complex electromagnetic interference and the short time slot of each satellite, which brings difficulties to the acquisition stage. The inter-satellite link in both Global Positioning System (GPS and BeiDou Navigation Satellite System (BDS adopt the long code spread spectrum system. However, long code acquisition is a difficult and time-consuming task due to the long code period. Traditional folding methods such as extended replica folding acquisition search technique (XFAST and direct average are largely restricted because of code Doppler and additional SNR loss caused by replica folding. The dual folding method (DF-XFAST and dual-channel method have been proposed to achieve long code acquisition in low SNR and high dynamic situations, respectively, but the former is easily affected by code Doppler and the latter is not fast enough. Considering the environment of inter-satellite links and the problems of existing algorithms, this paper proposes a new long code acquisition algorithm named dual-channel acquisition method based on the extended replica folding algorithm (DC-XFAST. This method employs dual channels for verification. Each channel contains an incoming signal block. Local code samples are folded and zero-padded to the length of the incoming signal block. After a circular FFT operation, the correlation results contain two peaks of the same magnitude and specified relative position. The detection process is eased through finding the two largest values. The verification takes all the full and partial peaks into account. Numerical results reveal that the DC-XFAST method can improve acquisition performance while acquisition speed is guaranteed. The method has a significantly higher acquisition probability than folding methods XFAST and DF-XFAST. Moreover, with the advantage of higher

  17. Cy-preds: An algorithm and a web service for the analysis and prediction of cysteine reactivity.

    Science.gov (United States)

    Soylu, İnanç; Marino, Stefano M

    2016-02-01

    Cysteine (Cys) is a critically important amino acid, serving a variety of functions within proteins including structural roles, catalysis, and regulation of function through post-translational modifications. Predicting which Cys residues are likely to be reactive is a very sought after feature. Few methods are currently available for the task, either based on evaluation of physicochemical features (e.g., pKa and exposure) or based on similarity with known instances. In this study, we developed an algorithm (named HAL-Cy) which blends previous work with novel implementations to identify reactive Cys from nonreactive. HAL-Cy present two major components: (i) an energy based part, rooted on the evaluation of H-bond network contributions and (ii) a knowledge based part, composed of different profiling approaches (including a newly developed weighting matrix for sequence profiling). In our evaluations, HAL-Cy provided significantly improved performances, as tested in comparisons with existing approaches. We implemented our algorithm in a web service (Cy-preds), the ultimate product of our work; we provided it with a variety of additional features, tools, and options: Cy-preds is capable of performing fully automated calculations for a thorough analysis of Cys reactivity in proteins, ranging from reactivity predictions (e.g., with HAL-Cy) to functional characterization. We believe it represents an original, effective, and very useful addition to the current array of tools available to scientists involved in redox biology, Cys biochemistry, and structural bioinformatics. © 2015 Wiley Periodicals, Inc.

  18. Applying a machine learning model using a locally preserving projection based feature regeneration algorithm to predict breast cancer risk

    Science.gov (United States)

    Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin

    2018-03-01

    Both conventional and deep machine learning has been used to develop decision-support tools applied in medical imaging informatics. In order to take advantages of both conventional and deep learning approach, this study aims to investigate feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, initially computed 44 image features related to the bilateral mammographic tissue density asymmetry were extracted. Then, an LLP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applying to this dataset, LLP-generated feature vector reduced the number of features from 44 to 4. Using a leave-onecase-out validation method, area under ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p breast cancer detected in the next subsequent mammography screening.

  19. Prediction of China's coal production-environmental pollution based on a hybrid genetic algorithm-system dynamics model

    International Nuclear Information System (INIS)

    Yu Shiwei; Wei Yiming

    2012-01-01

    This paper proposes a hybrid model based on genetic algorithm (GA) and system dynamics (SD) for coal production–environmental pollution load in China. GA has been utilized in the optimization of the parameters of the SD model to reduce implementation subjectivity. The chain of “Economic development–coal demand–coal production–environmental pollution load” of China in 2030 was predicted, and scenarios were analyzed. Results show that: (1) GA performs well in optimizing the parameters of the SD model objectively and in simulating the historical data; (2) The demand for coal energy continuously increases, although the coal intensity has actually decreased because of China's persistent economic development. Furthermore, instead of reaching a turning point by 2030, the environmental pollution load continuously increases each year even under the scenario where coal intensity decreased by 20% and investment in pollution abatement increased by 20%; (3) For abating the amount of “three types of wastes”, reducing the coal intensity is more effective than reducing the polluted production per tonne of coal and increasing investment in pollution control. - Highlights: ► We propos a GA-SD model for China's coal production-pollution prediction. ► Genetic algorithm (GA) can objectively and accurately optimize parameters of system dynamics (SD) model. ► Environmental pollution in China is projected to grow in our scenarios by 2030. ► The mechanism of reducing waste production per tonne of coal mining is more effective than others.

  20. Creating Aerosol Types from CHemistry (CATCH): A New Algorithm to Extend the Link Between Remote Sensing and Models

    Science.gov (United States)

    Dawson, K. W.; Meskhidze, N.; Burton, S. P.; Johnson, M. S.; Kacenelenbogen, M. S.; Hostetler, C. A.; Hu, Y.

    2017-11-01

    Current remote sensing methods can identify aerosol types within an atmospheric column, presenting an opportunity to incrementally bridge the gap between remote sensing and models. Here a new algorithm was designed for Creating Aerosol Types from CHemistry (CATCH). CATCH-derived aerosol types—dusty mix, maritime, urban, smoke, and fresh smoke—are based on first-generation airborne High Spectral Resolution Lidar (HSRL-1) retrievals during the Ship-Aircraft Bio-Optical Research (SABOR) campaign, July/August 2014. CATCH is designed to derive aerosol types from model output of chemical composition. CATCH-derived aerosol types are determined by multivariate clustering of model-calculated variables that have been trained using retrievals of aerosol types from HSRL-1. CATCH-derived aerosol types (with the exception of smoke) compare well with HSRL-1 retrievals during SABOR with an average difference in aerosol optical depth (AOD) methods. In the future, spaceborne HSRL-1 and CATCH can be used to gain insight into chemical composition of aerosol types, reducing uncertainties in estimates of aerosol radiative forcing.

  1. The prediction of preschool children's weight from family environment factors: gender-linked differences.

    Science.gov (United States)

    Tremblay, Line; Rinaldi, Christina M

    2010-12-01

    The main objective of this study was to test an explanatory model predicting preschool girls' and boys' body weight from certain child variables (food intake, sedentary behaviors, and eating habits), as well as family variables (interaction during mealtime and level of family financial resources allocated to meeting children's eating needs). A randomized stratified subsample of parents was selected from a larger study (Quebec Longitudinal Study of Child Development, QLSCD-1998-2002), with a breakdown of 581 girls and 611 boys of 4 years of age. Children's skin fold ratio, weight, height, and Body Mass Index (BMI) were recorded. Questionnaires were administered to parents (usually the mother). Using structural equation modeling (SEM) separately for girls and boys, the family environment model of healthy weight development was tested. Results yielded a good fit of the model for both genders. For boys, significant predictors of body weight in the model were family food insecurity and conflicts during mealtime. Healthy eating was predicted by food insecurity, mealtime conflicts, and sedentary behaviors. Mealtime conflicts predicted sedentary behaviors. For girls, none of the variables predicted body weight, however food insecurity predicted less healthy eating. These results outline the importance of prevention and intervention within families with young children. Copyright © 2010 Elsevier Ltd. All rights reserved.

  2. The missing link: Predicting connectomes from noisy and partially observed tract tracing data

    DEFF Research Database (Denmark)

    Hinne, Max; Meijers, Annet; Bakker, Rembrandt

    2017-01-01

    a high chance of being connected, while regions far apart are most likely disconnected in the connectome. After learning the latent embedding from the connections that we did observe, the latent space allows us to predict connections that have not been probed previously. We apply the methodology to two....... In this paper, we suggest that instead of probing all possible connections, hitherto unknown connections may be predicted from the data that is already available. Our approach uses a 'latent space model' that embeds the connectivity in an abstract physical space. Regions that are close in the latent space have...... connectivity data sets of the macaque, where we demonstrate that the latent space model is successful in predicting unobserved connectivity, outperforming two baselines and an alternative model in nearly all cases. Furthermore, we show how the latent spatial embedding may be used to integrate multimodal...

  3. A Tutorial on Nonlinear Time-Series Data Mining in Engineering Asset Health and Reliability Prediction: Concepts, Models, and Algorithms

    Directory of Open Access Journals (Sweden)

    Ming Dong

    2010-01-01

    Full Text Available The primary objective of engineering asset management is to optimize assets service delivery potential and to minimize the related risks and costs over their entire life through the development and application of asset health and usage management in which the health and reliability prediction plays an important role. In real-life situations where an engineering asset operates under dynamic operational and environmental conditions, the lifetime of an engineering asset is generally described as monitored nonlinear time-series data and subject to high levels of uncertainty and unpredictability. It has been proved that application of data mining techniques is very useful for extracting relevant features which can be used as parameters for assets diagnosis and prognosis. In this paper, a tutorial on nonlinear time-series data mining in engineering asset health and reliability prediction is given. Besides that an overview on health and reliability prediction techniques for engineering assets is covered, this tutorial will focus on concepts, models, algorithms, and applications of hidden Markov models (HMMs and hidden semi-Markov models (HSMMs in engineering asset health prognosis, which are representatives of recent engineering asset health prediction techniques.

  4. Predicting incomplete gene microarray data with the use of supervised learning algorithms

    CSIR Research Space (South Africa)

    Twala, B

    2010-10-01

    Full Text Available that prediction using supervised learning can be improved in probabilistic terms given incomplete microarray data. This imputation approach is based on the a priori probability of each value determined from the instances at that node of a decision tree (PDT...

  5. Algorithmic prediction of inter-song similarity in Western popular music

    NARCIS (Netherlands)

    Novello, A.; Par, van de S.L.J.D.E.; McKinney, M.F.; Kohlrausch, A.G.

    2013-01-01

    We investigate a method for automatic extraction of inter-song similarity for songs selected from several genres of Western popular music. The specific purpose of this approach is to evaluate the predictive power of different feature extraction sets based on human perception of music similarity and

  6. A spectral clustering search algorithm for predicting shallow landslide size and location

    Science.gov (United States)

    Dino Bellugi; David G. Milledge; William E. Dietrich; Jim A. McKean; J. Taylor Perron; Erik B. Sudderth; Brian Kazian

    2015-01-01

    The potential hazard and geomorphic significance of shallow landslides depend on their location and size. Commonly applied one-dimensional stability models do not include lateral resistances and cannot predict landslide size. Multi-dimensional models must be applied to specific geometries, which are not known a priori, and testing all possible geometries is...

  7. Predicting forest attributes from climate data using a recursive partitioning and regression tree algorithm

    Science.gov (United States)

    Greg C. Liknes; Christopher W. Woodall; Charles H. Perry

    2009-01-01

    Climate information frequently is included in geospatial modeling efforts to improve the predictive capability of other data sources. The selection of an appropriate climate data source requires consideration given the number of choices available. With regard to climate data, there are a variety of parameters (e.g., temperature, humidity, precipitation), time intervals...

  8. A neural network - based algorithm for predicting stone - free status after ESWL therapy.

    Science.gov (United States)

    Seckiner, Ilker; Seckiner, Serap; Sen, Haluk; Bayrak, Omer; Dogan, Kazim; Erturhan, Sakip

    2017-01-01

    The prototype artificial neural network (ANN) model was developed using data from patients with renal stone, in order to predict stone-free status and to help in planning treatment with Extracorporeal Shock Wave Lithotripsy (ESWL) for kidney stones. Data were collected from the 203 patients including gender, single or multiple nature of the stone, location of the stone, infundibulopelvic angle primary or secondary nature of the stone, status of hydronephrosis, stone size after ESWL, age, size, skin to stone distance, stone density and creatinine, for eleven variables. Regression analysis and the ANN method were applied to predict treatment success using the same series of data. Subsequently, patients were divided into three groups by neural network software, in order to implement the ANN: training group (n=139), validation group (n=32), and the test group (n=32). ANN analysis demonstrated that the prediction accuracy of the stone-free rate was 99.25% in the training group, 85.48% in the validation group, and 88.70% in the test group. Successful results were obtained to predict the stone-free rate, with the help of the ANN model designed by using a series of data collected from real patients in whom ESWL was implemented to help in planning treatment for kidney stones. Copyright® by the International Brazilian Journal of Urology.

  9. Thermogram breast cancer prediction approach based on Neutrosophic sets and fuzzy c-means algorithm.

    Science.gov (United States)

    Gaber, Tarek; Ismail, Gehad; Anter, Ahmed; Soliman, Mona; Ali, Mona; Semary, Noura; Hassanien, Aboul Ella; Snasel, Vaclav

    2015-08-01

    The early detection of breast cancer makes many women survive. In this paper, a CAD system classifying breast cancer thermograms to normal and abnormal is proposed. This approach consists of two main phases: automatic segmentation and classification. For the former phase, an improved segmentation approach based on both Neutrosophic sets (NS) and optimized Fast Fuzzy c-mean (F-FCM) algorithm was proposed. Also, post-segmentation process was suggested to segment breast parenchyma (i.e. ROI) from thermogram images. For the classification, different kernel functions of the Support Vector Machine (SVM) were used to classify breast parenchyma into normal or abnormal cases. Using benchmark database, the proposed CAD system was evaluated based on precision, recall, and accuracy as well as a comparison with related work. The experimental results showed that our system would be a very promising step toward automatic diagnosis of breast cancer using thermograms as the accuracy reached 100%.

  10. Prediction of heart disease using apache spark analysing decision trees and gradient boosting algorithm

    Science.gov (United States)

    Chugh, Saryu; Arivu Selvan, K.; Nadesh, RK

    2017-11-01

    Numerous destructive things influence the working arrangement of human body as hypertension, smoking, obesity, inappropriate medication taking which causes many contrasting diseases as diabetes, thyroid, strokes and coronary diseases. The impermanence and horribleness of the environment situation is also the reason for the coronary disease. The structure of Apache start relies on the evolution which requires gathering of the data. To break down the significance of use programming focused on data structure the Apache stop ought to be utilized and it gives various central focuses as it is fast in light as it uses memory worked in preparing. Apache Spark continues running on dispersed environment and chops down the data in bunches giving a high profitability rate. Utilizing mining procedure as a part of the determination of coronary disease has been exhaustively examined indicating worthy levels of precision. Decision trees, Neural Network, Gradient Boosting Algorithm are the various apache spark proficiencies which help in collecting the information.

  11. A General Mathematical Algorithm for Predicting the Course of Unfused Tetanic Contractions of Motor Units in Rat Muscle.

    Directory of Open Access Journals (Sweden)

    Rositsa Raikova

    Full Text Available An unfused tetanus of a motor unit (MU evoked by a train of pulses at variable interpulse intervals is the sum of non-equal twitch-like responses to these stimuli. A tool for a precise prediction of these successive contractions for MUs of different physiological types with different contractile properties is crucial for modeling the whole muscle behavior during various types of activity. The aim of this paper is to develop such a general mathematical algorithm for the MUs of the medial gastrocnemius muscle of rats. For this purpose, tetanic curves recorded for 30 MUs (10 slow, 10 fast fatigue-resistant and 10 fast fatigable were mathematically decomposed into twitch-like contractions. Each contraction was modeled by the previously proposed 6-parameter analytical function, and the analysis of these six parameters allowed us to develop a prediction algorithm based on the following input data: parameters of the initial twitch, the maximum force of a MU and the series of pulses. Linear relationship was found between the normalized amplitudes of the successive contractions and the remainder between the actual force levels at which the contraction started and the maximum tetanic force. The normalization was made according to the amplitude of the first decomposed twitch. However, the respective approximation lines had different specific angles with respect to the ordinate. These angles had different and non-overlapping ranges for slow and fast MUs. A sensitivity analysis concerning this slope was performed and the dependence between the angles and the maximal fused tetanic force normalized to the amplitude of the first contraction was approximated by a power function. The normalized MU contraction and half-relaxation times were approximated by linear functions depending on the normalized actual force levels at which each contraction starts. The normalization was made according to the contraction time of the first contraction. The actual force levels

  12. Semi-quantitative assessments of dextran toxicity on corneal endothelium: conceptual design of a predictive algorithm.

    Science.gov (United States)

    Filev, Filip; Oezcan, Ceprail; Feuerstacke, Jana; Linke, Stephan J; Wulff, Birgit; Hellwinkel, Olaf J C

    2017-03-01

    Dextran is added to corneal culture medium for at least 8 h prior to transplantation to ensure that the cornea is osmotically dehydrated. It is presumed that dextran has a certain toxic effect on corneal endothelium but the degree and the kinetics of this effect have not been quantified so far. We consider that such data regarding the toxicity of dextran on the corneal endothelium could have an impact on scheduling and logistics of corneal preparation in eye banking. In retrospective statistic analyses, we compared the progress of corneal endothelium (endothelium cell loss per day) of 1334 organ-cultured corneal explants in media with and without dextran. Also, the influence of donor-age, sex and cause of death on the observed dextran-mediated effect on endothelial cell counts was studied. Corneas cultured in dextran-free medium showed a mean endothelium cell count decrease of 0.7% per day. Dextran supplementation led to a mean endothelium cell loss of 2.01% per day; this reflects an increase by the factor of 2.9. The toxic impact of dextran was found to be time dependent; while the prevailing part of the effect was observed within the first 24 h after dextran-addition. Donor age, sex and cause of death did not seem to have an influence on the dextran-mediated toxicity. Based on these findings, we could design an algorithm which approximately describes the kinetics of dextran-toxicity. We reproduced the previously reported toxic effect of dextran on the corneal endothelium in vitro. Additionally, this is the first work that provides an algorithmic instrument for the semi-quantitative calculation of the putative endothelium cell count decrease in dextran containing medium for a given incubation time and could thus influence the time management and planning of corneal transplantations.

  13. Artificial neural network analysis based on genetic algorithm to predict the performance characteristics of a cross flow cooling tower

    Science.gov (United States)

    Wu, Jiasheng; Cao, Lin; Zhang, Guoqiang

    2018-02-01

    Cooling tower of air conditioning has been widely used as cooling equipment, and there will be broad application prospect if it can be reversibly used as heat source under heat pump heating operation condition. In view of the complex non-linear relationship of each parameter in the process of heat and mass transfer inside tower, In this paper, the BP neural network model based on genetic algorithm optimization (GABP neural network model) is established for the reverse use of cross flow cooling tower. The model adopts the structure of 6 inputs, 13 hidden nodes and 8 outputs. With this model, the outlet air dry bulb temperature, wet bulb temperature, water temperature, heat, sensible heat ratio and heat absorbing efficiency, Lewis number, a total of 8 the proportion of main performance parameters were predicted. Furthermore, the established network model is used to predict the water temperature and heat absorption of the tower at different inlet temperatures. The mean relative error MRE between BP predicted value and experimental value are 4.47%, 3.63%, 2.38%, 3.71%, 6.35%,3.14%, 13.95% and 6.80% respectively; the mean relative error MRE between GABP predicted value and experimental value are 2.66%, 3.04%, 2.27%, 3.02%, 6.89%, 3.17%, 11.50% and 6.57% respectively. The results show that the prediction results of GABP network model are better than that of BP network model; the simulation results are basically consistent with the actual situation. The GABP network model can well predict the heat and mass transfer performance of the cross flow cooling tower.

  14. A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures

    Science.gov (United States)

    2014-01-01

    Background Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information. Results We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure. Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets. Conclusions Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in

  15. Prediction of rain effects on earth-space communication links operating in the 10 to 35 GHz frequency range

    Science.gov (United States)

    Stutzman, Warren L.

    1989-01-01

    This paper reviews the effects of precipitation on earth-space communication links operating the 10 to 35 GHz frequency range. Emphasis is on the quantitative prediction of rain attenuation and depolarization. Discussions center on the models developed at Virginia Tech. Comments on other models are included as well as literature references to key works. Also included is the system level modeling for dual polarized communication systems with techniques for calculating antenna and propagation medium effects. Simple models for the calculation of average annual attenuation and cross-polarization discrimination (XPD) are presented. Calculation of worst month statistics are also presented.

  16. A Business Intelligence Model to Predict Bankruptcy using Financial Domain Ontology with Association Rule Mining Algorithm

    OpenAIRE

    Martin, A.; Manjula, M.; Venkatesan, Dr. V. Prasanna

    2011-01-01

    Today in every organization financial analysis provides the basis for understanding and evaluating the results of business operations and delivering how well a business is doing. This means that the organizations can control the operational activities primarily related to corporate finance. One way that doing this is by analysis of bankruptcy prediction. This paper develops an ontological model from financial information of an organization by analyzing the Semantics of the financial statement...

  17. Sensitivity and Uncertainty Analysis for Streamflow Prediction Using Different Objective Functions and Optimization Algorithms: San Joaquin California

    Science.gov (United States)

    Paul, M.; Negahban-Azar, M.

    2017-12-01

    The hydrologic models usually need to be calibrated against observed streamflow at the outlet of a particular drainage area through a careful model calibration. However, a large number of parameters are required to fit in the model due to their unavailability of the field measurement. Therefore, it is difficult to calibrate the model for a large number of potential uncertain model parameters. This even becomes more challenging if the model is for a large watershed with multiple land uses and various geophysical characteristics. Sensitivity analysis (SA) can be used as a tool to identify most sensitive model parameters which affect the calibrated model performance. There are many different calibration and uncertainty analysis algorithms which can be performed with different objective functions. By incorporating sensitive parameters in streamflow simulation, effects of the suitable algorithm in improving model performance can be demonstrated by the Soil and Water Assessment Tool (SWAT) modeling. In this study, the SWAT was applied in the San Joaquin Watershed in California covering 19704 km2 to calibrate the daily streamflow. Recently, sever water stress escalating due to intensified climate variability, prolonged drought and depleting groundwater for agricultural irrigation in this watershed. Therefore it is important to perform a proper uncertainty analysis given the uncertainties inherent in hydrologic modeling to predict the spatial and temporal variation of the hydrologic process to evaluate the impacts of different hydrologic variables. The purpose of this study was to evaluate the sensitivity and uncertainty of the calibrated parameters for predicting streamflow. To evaluate the sensitivity of the calibrated parameters three different optimization algorithms (Sequential Uncertainty Fitting- SUFI-2, Generalized Likelihood Uncertainty Estimation- GLUE and Parameter Solution- ParaSol) were used with four different objective functions (coefficient of determination

  18. A Bimodel Algorithm with Data-Divider to Predict Stock Index

    Directory of Open Access Journals (Sweden)

    Zhaoyue Wang

    2018-01-01

    Full Text Available There is not yet reliable software for stock prediction, because most experts of this area have been trying to predict an exact stock index. Considering that the fluctuation of a stock index usually is no more than 1% in a day, the error between the forecasted and the actual values should be no more than 0.5%. It is too difficult to realize. However, forecasting whether a stock index will rise or fall does not need to be so exact a numerical value. A few scholars noted the fact, but their systems do not yet work very well because different periods of a stock have different inherent laws. So, we should not depend on a single model or a set of parameters to solve the problem. In this paper, we developed a data-divider to divide a set of historical stock data into two parts according to rising period and falling period, training, respectively, two neural networks optimized by a GA. Above all, the data-divider enables us to avoid the most difficult problem, the effect of unexpected news, which could hardly be predicted. Experiments show that the accuracy of our method increases 20% compared to those of traditional methods.

  19. Fine-Tuning Nonhomogeneous Regression for Probabilistic Precipitation Forecasts: Unanimous Predictions, Heavy Tails, and Link Functions

    DEFF Research Database (Denmark)

    Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.

    2017-01-01

    functions for the optimization of regression coefficients for the scale parameter. These three refinements are tested for 10 stations in a small area of the European Alps for lead times from +24 to +144 h and accumulation periods of 24 and 6 h. Together, they improve probabilistic forecasts...... to obtain automatically corrected weather forecasts. This study applies the nonhomogenous regression framework as a state-of-the-art ensemble postprocessing technique to predict a full forecast distribution and improves its forecast performance with three statistical refinements. First of all, a novel split...... for precipitation amounts as well as the probability of precipitation events over the default postprocessing method. The improvements are largest for the shorter accumulation periods and shorter lead times, where the information of unanimous ensemble predictions is more important....

  20. Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm

    Science.gov (United States)

    Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.

    2017-01-01

    This study proposed the application of data mining technique namely Adaptive Neuro-Fuzzy inference system (ANFIS) on forest fires hotspot data to develop classification models for hotspots occurrence in Central Kalimantan. Hotspot is a point that is indicated as the location of fires. In this study, hotspot distribution is categorized as true alarm and false alarm. ANFIS is a soft computing method in which a given inputoutput data set is expressed in a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method of this study classified hotspots as target objects by correlating spatial attributes data using three folds in ANFIS algorithm to obtain the best model. The best result obtained from the 3rd fold provided low error for training (error = 0.0093676) and also low error testing result (error = 0.0093676). Attribute of distance to road is the most determining factor that influences the probability of true and false alarm where the level of human activities in this attribute is higher. This classification model can be used to develop early warning system of forest fire.

  1. ANFIS-based genetic algorithm for predicting the optimal sizing coefficient of photovoltaic supply systems

    Energy Technology Data Exchange (ETDEWEB)

    Mellit, A. [Medea Univ., Medea (Algeria). Inst. of Science Engineering, Dept. of Electronics

    2007-07-01

    Stand-alone photovoltaic (PV) power supply systems are regarded as reliable and economical sources of electricity in rural remote areas, particularly in developing countries. However, the sizing of stand-alone photovoltaic (PV) systems is an important part of the system design. Choosing the optimal number of solar cell panels and the size of the storage battery to be used for a certain application at a particular site is an important economical problem. In this paper, a genetic algorithm (GA) and an adaptive neuro-fuzzy inference scheme (ANFIS) were proposed as a means for determining the optimal size of PV system, particularly, in isolated areas. The GA-ANFIS model was shown to be suitable for modelling the optimal sizing parameters of PVS systems. The GA was used to determine the PV-array capacity and the storage capacity for 60 sites. From this database, 56 pairs relative to 56 sites were used for training the network. Four pairs were used for testing and validating the ANFIS model. A correlation of 99 per cent was achieved when complete unknown data parameters were presented to the model. The proposed technique provided more accurate results than the alternative artificial neural network (ANN) with GA. The advantage of this model was that it could estimate the PV-array area and the useful capacity of the battery from only geographical coordinates. Although the technique was applied and tested in Algeria, it can be generalized for any location in the world. 15 refs., 4 tabs., 8 figs.

  2. Prediction and optimization of fuel cell performance using a multi-objective genetic algorithm

    Energy Technology Data Exchange (ETDEWEB)

    Marques Hobold, Gustavo [Laboratory of Energy Conversion Engineering and Technology, Federal University of Santa Catarina (Brazil); Washington University in St. Louis, MO 63130 (United States); Agarwal, Ramesh K. [Department of Mechanical Engineering and Materials Science, Washington University in St. Louis, MO 63130 (United States)

    2013-07-01

    The attention that is currently being given to the emission of pollutant gases in the atmosphere has made the fuel cell (FC), an energy conversion device that cleanly converts chemical energy into electrical energy, a good alternative to other technologies that still use carbon-based fuels. The temperature plays an important role on the efficiency of an FC as it influences directly the humidity of the membrane, the reversible thermodynamic potential and the partial pressure of water; therefore the thermal control of the fuel cell is the focus of this paper. We present models for both high and low temperature fuel cells based on the solid-oxide fuel cell (SOFC) and the polymer electrolyte membrane fuel cell (PEMFC). A thermodynamic analysis is performed on the cells and the methods of controlling their temperature are discussed. The cell parameters are optimized for both high and low temperatures using a Java-based multi-objective genetic algorithm, which makes use of the logic of the biological theory of evolution to classify individual parameters based on a fitness function in order to maximize the power of the fuel cell. Applications to high and low temperature fuel cells are discussed.

  3. An overview of data mining algorithms in drug induced toxicity prediction.

    Science.gov (United States)

    Omer, Ankur; Singh, Poonam; Yadav, N K; Singh, R K

    2014-04-01

    The growth in chemical diversity has increased the need to adjudicate the toxicity of different chemical compounds raising the burden on the demand of animal testing. The toxicity evaluation requires time consuming and expensive undertaking, leading to the deprivation of the methods employed for screening chemicals pointing towards the need to develop more efficient toxicity assessment systems. Computational approaches have reduced the time as well as the cost for evaluating the toxicity and kinetic behavior of any chemical. The accessibility of a large amount of data and the intense need of turning this data into useful information have attracted the attention towards data mining. Machine Learning, one of the powerful data mining techniques has evolved as the most effective and potent tool for exploring new insights on combinatorial relationships among various experimental data generated. The article accounts on some sophisticated machine learning algorithms like Artificial Neural Networks (ANN), Support Vector Machine (SVM), k-mean clustering and Self Organizing Maps (SOM) with some of the available tools used for classification, sorting and toxicological evaluation of data, clarifying, how data mining and machine learning interact cooperatively to facilitate knowledge discovery. Addressing the association of some commonly used expert systems, we briefly outline some real world applications to consider the crucial role of data set partitioning.

  4. Linking precipitation, evapotranspiration and soil moisture content for the improvement of predictability over land

    Science.gov (United States)

    Catalano, Franco; Alessandri, Andrea; De Felice, Matteo

    2013-04-01

    Climate change scenarios are expected to show an intensification of the hydrological cycle together with modifications of evapotranspiration and soil moisture content. Evapotranspiration changes have been already evidenced for the end of the 20th century. The variance of evapotranspiration has been shown to be strongly related to the variance of precipitation over land. Nevertheless, the feedbacks between evapotranspiration, soil moisture and precipitation have not yet been completely understood at present-day. Furthermore, soil moisture reservoirs are associated to a memory and thus their proper initialization may have a strong influence on predictability. In particular, the linkage between precipitation and soil moisture is modulated by the effects on evapotranspiration. Therefore, the investigation of the coupling between these variables appear to be of primary importance for the improvement of predictability over the continents. The coupled manifold (CM) technique (Navarra and Tribbia 2005) is a method designed to separate the effects of the variability of two variables which are connected. This method has proved to be successful for the analysis of different climate fields, like precipitation, vegetation and sea surface temperature. In particular, the coupled variables reveal patterns that may be connected with specific phenomena, thus providing hints regarding potential predictability. In this study we applied the CM to recent observational datasets of precipitation (from CRU), evapotranspiration (from GIMMS and MODIS satellite-based estimates) and soil moisture content (from ESA) spanning a time period of 23 years (1984-2006) with a monthly frequency. Different data stratification (monthly, seasonal, summer JJA) have been employed to analyze the persistence of the patterns and their characteristical time scales and seasonality. The three variables considered show a significant coupling among each other. Interestingly, most of the signal of the

  5. Linking removal targets to the ecological effects of invaders: a predictive model and field test.

    Science.gov (United States)

    Green, Stephanie J; Dulvy, Nicholas K; Brooks, Annabelle M L; Akins, John L; Cooper, Andrew B; Miller, Skylar; Côté, Isabelle M

    Species invasions have a range of negative effects on recipient ecosystems, and many occur at a scale and magnitude that preclude complete eradication. When complete extirpation is unlikely with available management resources, an effective strategy may be to suppress invasive populations below levels predicted to cause undesirable ecological change. We illustrated this approach by developing and testing targets for the control of invasive Indo-Pacific lionfish (Pterois volitans and P. miles) on Western Atlantic coral reefs. We first developed a size-structured simulation model of predation by lionfish on native fish communities, which we used to predict threshold densities of lionfish beyond which native fish biomass should decline. We then tested our predictions by experimentally manipulating lionfish densities above or below reef-specific thresholds, and monitoring the consequences for native fish populations on 24 Bahamian patch reefs over 18 months. We found that reducing lionfish below predicted threshold densities effectively protected native fish community biomass from predation-induced declines. Reductions in density of 25–92%, depending on the reef, were required to suppress lionfish below levels predicted to overconsume prey. On reefs where lionfish were kept below threshold densities, native prey fish biomass increased by 50–70%. Gains in small (15 cm total length), including ecologically important grazers and economically important fisheries species, had increased by 10–65% by the end of the experiment. Crucially, similar gains in prey fish biomass were realized on reefs subjected to partial and full removal of lionfish, but partial removals took 30% less time to implement. By contrast, the biomass of small native fishes declined by >50% on all reefs with lionfish densities exceeding reef-specific thresholds. Large inter-reef variation in the biomass of prey fishes at the outset of the study, which influences the threshold density of lionfish

  6. Propensity scores-potential outcomes framework to incorporate severity probabilities in the highway safety manual crash prediction algorithm.

    Science.gov (United States)

    Sasidharan, Lekshmi; Donnell, Eric T

    2014-10-01

    Accurate estimation of the expected number of crashes at different severity levels for entities with and without countermeasures plays a vital role in selecting countermeasures in the framework of the safety management process. The current practice is to use the American Association of State Highway and Transportation Officials' Highway Safety Manual crash prediction algorithms, which combine safety performance functions and crash modification factors, to estimate the effects of safety countermeasures on different highway and street facility types. Many of these crash prediction algorithms are based solely on crash frequency, or assume that severity outcomes are unchanged when planning for, or implementing, safety countermeasures. Failing to account for the uncertainty associated with crash severity outcomes, and assuming crash severity distributions remain unchanged in safety performance evaluations, limits the utility of the Highway Safety Manual crash prediction algorithms in assessing the effect of safety countermeasures on crash severity. This study demonstrates the application of a propensity scores-potential outcomes framework to estimate the probability distribution for the occurrence of different crash severity levels by accounting for the uncertainties associated with them. The probability of fatal and severe injury crash occurrence at lighted and unlighted intersections is estimated in this paper using data from Minnesota. The results show that the expected probability of occurrence of fatal and severe injury crashes at a lighted intersection was 1 in 35 crashes and the estimated risk ratio indicates that the respective probabilities at an unlighted intersection was 1.14 times higher compared to lighted intersections. The results from the potential outcomes-propensity scores framework are compared to results obtained from traditional binary logit models, without application of propensity scores matching. Traditional binary logit analysis suggests that

  7. Performance of Machine Learning Algorithms for Qualitative and Quantitative Prediction Drug Blockade of hERG1 channel.

    Science.gov (United States)

    Wacker, Soren; Noskov, Sergei Yu

    2018-05-01

    Drug-induced abnormal heart rhythm known as Torsades de Pointes (TdP) is a potential lethal ventricular tachycardia found in many patients. Even newly released anti-arrhythmic drugs, like ivabradine with HCN channel as a primary target, block the hERG potassium current in overlapping concentration interval. Promiscuous drug block to hERG channel may potentially lead to perturbation of the action potential duration (APD) and TdP, especially when with combined with polypharmacy and/or electrolyte disturbances. The example of novel anti-arrhythmic ivabradine illustrates clinically important and ongoing deficit in drug design and warrants for better screening methods. There is an urgent need to develop new approaches for rapid and accurate assessment of how drugs with complex interactions and multiple subcellular targets can predispose or protect from drug-induced TdP. One of the unexpected outcomes of compulsory hERG screening implemented in USA and European Union resulted in large datasets of IC 50 values for various molecules entering the market. The abundant data allows now to construct predictive machine-learning (ML) models. Novel ML algorithms and techniques promise better accuracy in determining IC 50 values of hERG blockade that is comparable or surpassing that of the earlier QSAR or molecular modeling technique. To test the performance of modern ML techniques, we have developed a computational platform integrating various workflows for quantitative structure activity relationship (QSAR) models using data from the ChEMBL database. To establish predictive powers of ML-based algorithms we computed IC 50 values for large dataset of molecules and compared it to automated patch clamp system for a large dataset of hERG blocking and non-blocking drugs, an industry gold standard in studies of cardiotoxicity. The optimal protocol with high sensitivity and predictive power is based on the novel eXtreme gradient boosting (XGBoost) algorithm. The ML-platform with XGBoost

  8. Spatiotemporal prediction of continuous daily PM2.5 concentrations across China using a spatially explicit machine learning algorithm

    Science.gov (United States)

    Zhan, Yu; Luo, Yuzhou; Deng, Xunfei; Chen, Huajin; Grieneisen, Michael L.; Shen, Xueyou; Zhu, Lizhong; Zhang, Minghua

    2017-04-01

    A high degree of uncertainty associated with the emission inventory for China tends to degrade the performance of chemical transport models in predicting PM2.5 concentrations especially on a daily basis. In this study a novel machine learning algorithm, Geographically-Weighted Gradient Boosting Machine (GW-GBM), was developed by improving GBM through building spatial smoothing kernels to weigh the loss function. This modification addressed the spatial nonstationarity of the relationships between PM2.5 concentrations and predictor variables such as aerosol optical depth (AOD) and meteorological conditions. GW-GBM also overcame the estimation bias of PM2.5 concentrations due to missing AOD retrievals, and thus potentially improved subsequent exposure analyses. GW-GBM showed good performance in predicting daily PM2.5 concentrations (R2 = 0.76, RMSE = 23.0 μg/m3) even with partially missing AOD data, which was better than the original GBM model (R2 = 0.71, RMSE = 25.3 μg/m3). On the basis of the continuous spatiotemporal prediction of PM2.5 concentrations, it was predicted that 95% of the population lived in areas where the estimated annual mean PM2.5 concentration was higher than 35 μg/m3, and 45% of the population was exposed to PM2.5 >75 μg/m3 for over 100 days in 2014. GW-GBM accurately predicted continuous daily PM2.5 concentrations in China for assessing acute human health effects.

  9. High-order accurate numerical algorithm for three-dimensional transport prediction

    Energy Technology Data Exchange (ETDEWEB)

    Pepper, D W [Savannah River Lab., Aiken, SC; Baker, A J

    1980-01-01

    The numerical solution of the three-dimensional pollutant transport equation is obtained with the method of fractional steps; advection is solved by the method of moments and diffusion by cubic splines. Topography and variable mesh spacing are accounted for with coordinate transformations. First estimate wind fields are obtained by interpolation to grid points surrounding specific data locations. Numerical results agree with results obtained from analytical Gaussian plume relations for ideal conditions. The numerical model is used to simulate the transport of tritium released from the Savannah River Plant on 2 May 1974. Predicted ground level air concentration 56 km from the release point is within 38% of the experimentally measured value.

  10. Algorithms for the prediction of retinopathy of prematurity based on postnatal weight gain.

    Science.gov (United States)

    Binenbaum, Gil

    2013-06-01

    Current ROP screening guidelines represent a simple risk model with two dichotomized factors, birth weight and gestational age at birth. Pioneering work has shown that tracking postnatal weight gain, a surrogate for low insulin-like growth factor 1, may capture the influence of many other ROP risk factors and improve risk prediction. Models including weight gain, such as WINROP, ROPScore, and CHOP ROP, have demonstrated accurate ROP risk assessment and a potentially large reduction in ROP examinations, compared to current guidelines. However, there is a need for larger studies, and generalizability is limited in countries with developing neonatal care systems. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Applying Probability Theory for the Quality Assessment of a Wildfire Spread Prediction Framework Based on Genetic Algorithms

    Directory of Open Access Journals (Sweden)

    Andrés Cencerrado

    2013-01-01

    Full Text Available This work presents a framework for assessing how the existing constraints at the time of attending an ongoing forest fire affect simulation results, both in terms of quality (accuracy obtained and the time needed to make a decision. In the wildfire spread simulation and prediction area, it is essential to properly exploit the computational power offered by new computing advances. For this purpose, we rely on a two-stage prediction process to enhance the quality of traditional predictions, taking advantage of parallel computing. This strategy is based on an adjustment stage which is carried out by a well-known evolutionary technique: Genetic Algorithms. The core of this framework is evaluated according to the probability theory principles. Thus, a strong statistical study is presented and oriented towards the characterization of such an adjustment technique in order to help the operation managers deal with the two aspects previously mentioned: time and quality. The experimental work in this paper is based on a region in Spain which is one of the most prone to forest fires: El Cap de Creus.

  12. A Novel Approach for Blast-Induced Flyrock Prediction Based on Imperialist Competitive Algorithm and Artificial Neural Network

    Science.gov (United States)

    Marto, Aminaton; Jahed Armaghani, Danial; Tonnizam Mohamad, Edy; Makhtar, Ahmad Mahir

    2014-01-01

    Flyrock is one of the major disturbances induced by blasting which may cause severe damage to nearby structures. This phenomenon has to be precisely predicted and subsequently controlled through the changing in the blast design to minimize potential risk of blasting. The scope of this study is to predict flyrock induced by blasting through a novel approach based on the combination of imperialist competitive algorithm (ICA) and artificial neural network (ANN). For this purpose, the parameters of 113 blasting operations were accurately recorded and flyrock distances were measured for each operation. By applying the sensitivity analysis, maximum charge per delay and powder factor were determined as the most influential parameters on flyrock. In the light of this analysis, two new empirical predictors were developed to predict flyrock distance. For a comparison purpose, a predeveloped backpropagation (BP) ANN was developed and the results were compared with those of the proposed ICA-ANN model and empirical predictors. The results clearly showed the superiority of the proposed ICA-ANN model in comparison with the proposed BP-ANN model and empirical approaches. PMID:25147856

  13. A Novel Approach for Blast-Induced Flyrock Prediction Based on Imperialist Competitive Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Aminaton Marto

    2014-01-01

    Full Text Available Flyrock is one of the major disturbances induced by blasting which may cause severe damage to nearby structures. This phenomenon has to be precisely predicted and subsequently controlled through the changing in the blast design to minimize potential risk of blasting. The scope of this study is to predict flyrock induced by blasting through a novel approach based on the combination of imperialist competitive algorithm (ICA and artificial neural network (ANN. For this purpose, the parameters of 113 blasting operations were accurately recorded and flyrock distances were measured for each operation. By applying the sensitivity analysis, maximum charge per delay and powder factor were determined as the most influential parameters on flyrock. In the light of this analysis, two new empirical predictors were developed to predict flyrock distance. For a comparison purpose, a predeveloped backpropagation (BP ANN was developed and the results were compared with those of the proposed ICA-ANN model and empirical predictors. The results clearly showed the superiority of the proposed ICA-ANN model in comparison with the proposed BP-ANN model and empirical approaches.

  14. Prediction of non-canonical polyadenylation signals in human genomic sequences based on a novel algorithm using a fuzzy membership function.

    Science.gov (United States)

    Kamasawa, Masami; Horiuchi, Jun-Ichi

    2009-05-01

    Computational prediction of polyadenylation signals (PASes) is essential for analysis of alternative polyadenylation that plays crucial roles in gene regulations by generating heterogeneity of 3'-UTR of mRNAs. To date, several algorithms that are mostly based on machine learning methods have been developed to predict PASes. Accuracies of predictions by those algorithms have improved significantly for the last decade. However, they are designed primarily for prediction of the most canonical AAUAAA and its common variant AUUAAA whereas other variants have been ignored in their predictions despite recent studies indicating that non-canonical variants of AAUAAA are more important in the polyadenylation process than commonly recognized. Here we present a new algorithm "PolyF" employing fuzzy logic to confer an advance in computational PAS prediction--enable prediction of the non-canonical variants, and improve the accuracies for the canonical A(A/U)UAAA prediction. PolyF is a simple computational algorithm that is composed of membership functions defining sequence features of downstream sequence element (DSE) and upstream sequence element (USE), together with an inference engine. As a result, PolyF successfully identified the 10 single-nucleotide variants with approximately the same or higher accuracies compared to those for A(A/U)UAAA. PolyF also achieved higher accuracies for A(A/U)UAAA prediction than those by commonly known PAS finder programs, Polyadq and Erpin. Incorporating the USE into the PolyF algorithm was found to enhance prediction accuracies for all the 12 PAS hexamers compared to those using only the DSE, suggesting an important contribution of the USE in the polyadenylation process.

  15. DNApi: A De Novo Adapter Prediction Algorithm for Small RNA Sequencing Data.

    Science.gov (United States)

    Tsuji, Junko; Weng, Zhiping

    2016-01-01

    With the rapid accumulation of publicly available small RNA sequencing datasets, third-party meta-analysis across many datasets is becoming increasingly powerful. Although removing the 3´ adapter is an essential step for small RNA sequencing analysis, the adapter sequence information is not always available in the metadata. The information can be also erroneous even when it is available. In this study, we developed DNApi, a lightweight Python software package that predicts the 3´ adapter sequence de novo and provides the user with cleansed small RNA sequences ready for down stream analysis. Tested on 539 publicly available small RNA libraries accompanied with 3´ adapter sequences in their metadata, DNApi shows near-perfect accuracy (98.5%) with fast runtime (~2.85 seconds per library) and efficient memory usage (~43 MB on average). In addition to 3´ adapter prediction, it is also important to classify whether the input small RNA libraries were already processed, i.e. the 3´ adapters were removed. DNApi perfectly judged that given another batch of datasets, 192 publicly available processed libraries were "ready-to-map" small RNA sequence. DNApi is compatible with Python 2 and 3, and is available at https://github.com/jnktsj/DNApi. The 731 small RNA libraries used for DNApi evaluation were from human tissues and were carefully and manually collected. This study also provides readers with the curated datasets that can be integrated into their studies.

  16. Research on time series data prediction based on clustering algorithm - A case study of Yuebao

    Science.gov (United States)

    Lu, Xu; Zhao, Tianzhong

    2017-08-01

    Forecasting is the prerequisite for making scientific decisions, it is based on the past information of the research on the phenomenon, and combined with some of the factors affecting this phenomenon, then using scientific methods to forecast the development trend of the future, it is an important way for people to know the world. This is particularly important in the prediction of financial data, because proper financial data forecasts can provide a great deal of help to financial institutions in their strategic implementation, strategic alignment and risk control. However, the current forecasts of financial data generally use the method of forecast of overall data, which lack of consideration of customer behavior and other factors in the financial data forecasting process, and they are important factors influencing the change of financial data. Based on this situation, this paper analyzed the data of Yuebao, and according to the user's attributes and the operating characteristics, this paper classified 567 users of Yuebao, and made further predicted the data of Yuebao for every class of users, the results showed that the forecasting model in this paper can meet the demand of forecasting.

  17. Short-term prediction of rain attenuation level and volatility in Earth-to-Satellite links at EHF band

    Directory of Open Access Journals (Sweden)

    L. de Montera

    2008-08-01

    Full Text Available This paper shows how nonlinear models originally developed in the finance field can be used to predict rain attenuation level and volatility in Earth-to-Satellite links operating at the Extremely High Frequencies band (EHF, 20–50 GHz. A common approach to solving this problem is to consider that the prediction error corresponds only to scintillations, whose variance is assumed to be constant. Nevertheless, this assumption does not seem to be realistic because of the heteroscedasticity of error time series: the variance of the prediction error is found to be time-varying and has to be modeled. Since rain attenuation time series behave similarly to certain stocks or foreign exchange rates, a switching ARIMA/GARCH model was implemented. The originality of this model is that not only the attenuation level, but also the error conditional distribution are predicted. It allows an accurate upper-bound of the future attenuation to be estimated in real time that minimizes the cost of Fade Mitigation Techniques (FMT and therefore enables the communication system to reach a high percentage of availability. The performance of the switching ARIMA/GARCH model was estimated using a measurement database of the Olympus satellite 20/30 GHz beacons and this model is shown to outperform significantly other existing models.

    The model also includes frequency scaling from the downlink frequency to the uplink frequency. The attenuation effects (gases, clouds and rain are first separated with a neural network and then scaled using specific scaling factors. As to the resulting uplink prediction error, the error contribution of the frequency scaling step is shown to be larger than that of the downlink prediction, indicating that further study should focus on improving the accuracy of the scaling factor.

  18. Short-term prediction of rain attenuation level and volatility in Earth-to-Satellite links at EHF band

    Science.gov (United States)

    de Montera, L.; Mallet, C.; Barthès, L.; Golé, P.

    2008-08-01

    This paper shows how nonlinear models originally developed in the finance field can be used to predict rain attenuation level and volatility in Earth-to-Satellite links operating at the Extremely High Frequencies band (EHF, 20 50 GHz). A common approach to solving this problem is to consider that the prediction error corresponds only to scintillations, whose variance is assumed to be constant. Nevertheless, this assumption does not seem to be realistic because of the heteroscedasticity of error time series: the variance of the prediction error is found to be time-varying and has to be modeled. Since rain attenuation time series behave similarly to certain stocks or foreign exchange rates, a switching ARIMA/GARCH model was implemented. The originality of this model is that not only the attenuation level, but also the error conditional distribution are predicted. It allows an accurate upper-bound of the future attenuation to be estimated in real time that minimizes the cost of Fade Mitigation Techniques (FMT) and therefore enables the communication system to reach a high percentage of availability. The performance of the switching ARIMA/GARCH model was estimated using a measurement database of the Olympus satellite 20/30 GHz beacons and this model is shown to outperform significantly other existing models. The model also includes frequency scaling from the downlink frequency to the uplink frequency. The attenuation effects (gases, clouds and rain) are first separated with a neural network and then scaled using specific scaling factors. As to the resulting uplink prediction error, the error contribution of the frequency scaling step is shown to be larger than that of the downlink prediction, indicating that further study should focus on improving the accuracy of the scaling factor.

  19. CATHEDRAL: a fast and effective algorithm to predict folds and domain boundaries from multidomain protein structures.

    Directory of Open Access Journals (Sweden)

    Oliver C Redfern

    2007-11-01

    Full Text Available We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure-based method (using graph theory to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these

  20. Reconstruction of DNA sequences using genetic algorithms and cellular automata: towards mutation prediction?

    Science.gov (United States)

    Mizas, Ch; Sirakoulis, G Ch; Mardiris, V; Karafyllidis, I; Glykos, N; Sandaltzopoulos, R

    2008-04-01

    Change of DNA sequence that fuels evolution is, to a certain extent,