WorldWideScience

Sample records for association rule mining

  1. Association Rule Mining in Distributed Environment

    OpenAIRE

    Mrs. V. C. Kulloli (Guide); O.A. Omble

    2014-01-01

    Association rule mining is an important term in data mining. Association rule mining generates important rules from the data. These rules are called frequent rules, and the whole concept is known as frequent rule mining. Earlier, this technique was implemented on local machines to generate rules. But as the data size grows with the number of transactions, local machines take a long time to compute the frequent rules. To reduce the time, local machines started...

  2. A RECENT REVIEW ON ASSOCIATION RULE MINING

    OpenAIRE

    Maragatham G; Lakshmi M.

    2011-01-01

    Recently more encroachment has emerged in the field of data mining. One of the hottest topics in this area is mining for hidden patterns from the existing massive collection of databases. The knowledge obtained from these databases is used for different applications like supermarket sales prediction, fraud detection, etc. In this article, the various advancements in data mining using association rule mining are discussed. The role of Association rules in temporal mining, utility mining, sta...

  3. ASSOCIATION RULE MINING IN ECOMMERCE: A SURVEY

    Directory of Open Access Journals (Sweden)

    Venkateswari S,

    2011-04-01

    Association Rule mining is one of the most popular data mining techniques, which can be defined as extracting the interesting correlations and relations among a large volume of transactions. E-commerce applications generate huge amounts of operational and behavioral data. Applying association rule mining in e-commerce applications can unearth the hidden knowledge in these data. In this paper a survey of association rule mining and its various applications in the e-commerce environment is made.

  4. Improved Apriori Algorithm for Mining Association Rules

    OpenAIRE

    Darshan M. Tank

    2014-01-01

    Association rules are the main technique for data mining. The Apriori algorithm is a classical algorithm of association rule mining. Lots of algorithms for mining association rules and their mutations are proposed on the basis of the Apriori algorithm, but traditional algorithms are not efficient. There are two bottlenecks in frequent itemset mining: the large number of candidate 2-itemsets and the poor efficiency of counting their support. Proposed algorithm reduces one redundant pruning operations of C...

  5. A REVIEW ON ASSOCIATION RULE MINING ALGORITHMS

    OpenAIRE

    JYOTI ARORA, NIDHI BHALLA, SANJEEV RAO

    2013-01-01

    This paper reviews four different association rule mining algorithms (Apriori, AprioriTid, Apriori hybrid and Tertius) and their drawbacks, which would be helpful for finding new solutions to the problems found in these algorithms, and also presents a comparison between different association mining algorithms. Association rule mining is one of the most important techniques of data mining. Its aim is to extract interesting correlations, frequent patterns and association among s...

  6. Association Rule Mining and Its Application

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Several algorithms in data mining technique have been studied recently, among which association is one of the most important techniques. In this paper, we introduce theory of association rule in data mining, and analyze the characteristics of postal EMS service. We create a data warehouse model for EMS services and give the procedure of applying association rule mining based on it. In the end, we give an example of the whole mining procedure. This EMS-Data warehouse model and association rule mining technique have been applied in a practical Postal CRM System.

  7. A Collaborative Educational Association Rule Mining Tool

    Science.gov (United States)

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; de Castro, Carlos

    2011-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the ongoing improvement of e-learning courses and allowing teachers with similar course profiles to share and score the discovered information. The mining tool is oriented to be used by non-expert instructors in data mining so its internal…

  8. Efficient Mining of Intertransaction Association Rules

    NARCIS (Netherlands)

    Tung, A.K.H.; Lu, H.J.; Han, J.W.; Feng, L.

    2003-01-01

    Most of the previous studies on mining association rules are on mining intratransaction associations, i.e., the associations among items within the same transaction where the notion of the transaction could be the items bought by the same customer, the events happened on the same day, etc. In this s

  9. MINING ASSOCIATION RULES FROM XML DOCUMENT

    OpenAIRE

    Neha M. Shroff; G. V. Gujar

    2014-01-01

    In this work we describe an approach to mine Tree-based association rules from XML documents. Such rules provide information on both the structure and the content of XML documents; moreover, they can be stored in XML format to be queried later on. The mined knowledge is approximate, intensional knowledge used to provide: (i) quick, approximate answers to queries and (ii) information about structural regularities that can be used as dataguides for document querying. A prototype of the proposed...

  10. Support-Less Association Rule Mining Using Tuple Count Cube

    OpenAIRE

    Qin Ding; William Perrizo

    2007-01-01

    Association rule mining is one of the important tasks in data mining and knowledge discovery (KDD). The traditional task of association rule mining is to find all the rules with high support and high confidence. In some applications, we are interested in finding high confidence rules even though the support may be low. This type of problem differs from the traditional association rule mining problem; hence, it is called support-less association rule mining. Existing algorithms for association...
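
The distinction drawn in this abstract can be illustrated with a small sketch. The toy baskets and the helper names `count`, `support` and `confidence` are illustrative assumptions, not taken from the paper. The code uses the standard definitions (support: fraction of transactions containing an itemset; confidence: fraction of the transactions containing the antecedent that also contain the consequent) to show a rule with low support but high confidence.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def count(itemset: FrozenSet[str], transactions: List[Transaction]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent. Returns 0.0 if the antecedent never occurs."""
    base = count(antecedent, transactions)
    if base == 0:
        return 0.0
    return count(antecedent | consequent, transactions) / base


# Hypothetical toy data: ten shopping baskets.
BASKETS: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"butter"}),
    frozenset({"caviar", "champagne"}),
    frozenset({"bread"}),
]

rare = (frozenset({"caviar"}), frozenset({"champagne"}))
common = (frozenset({"bread"}), frozenset({"milk"}))
for antecedent, consequent in (rare, common):
    print(sorted(antecedent), "->", sorted(consequent),
          "support:", support(antecedent | consequent, BASKETS),
          "confidence:", round(confidence(antecedent, consequent, BASKETS), 3))
```

With these baskets, caviar → champagne occurs in only one of ten transactions but holds every time caviar appears, so a conventional minimum-support threshold of, say, 0.3 would discard it even though its confidence is 1.0.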

  11. Association Rule Mining for Web Recommendation

    Directory of Open Access Journals (Sweden)

    R. Suguna

    2012-10-01

    Web usage mining is the application of web mining to discover useful patterns from the web in order to understand and analyze the behavior of web users and web-based applications. It is an emerging research trend for today’s researchers. It deals entirely with web log files, which contain the user website access information. It is interesting to analyze and understand user behavior regarding web access. Web usage mining normally has three categories: 1. Preprocessing, 2. Pattern Discovery and 3. Pattern Analysis. This paper proposes association rule mining algorithms for better Web Recommendation and Web Personalization. Web recommendation systems are considered to play an important role in understanding customers’ behavior and interests, improving customer convenience, increasing service provider profits and meeting future needs.

  12. Reduction of Negative Rules in Association Rule Mining Using Distance Security and Genetic Algorithm

    OpenAIRE

    Girish Kumar Ameta; Chhavi Saxena

    2013-01-01

    The increasing rate of data makes mining useful association rules a challenging task in data mining. Classical association rule mining generates rules with various problems, such as pruning passes over the transaction database, negative rule generation and superiority of the rule set. From time to time, various researchers have modified classical association rule mining with different approaches. But in the current scenario association rule mining suffers from superiority rule generation. The problem of superiority is sol...

  13. Mining Hesitation Information by Vague Association Rules

    Science.gov (United States)

    Lu, An; Ng, Wilfred

    In many online shopping applications, such as Amazon and eBay, traditional Association Rule (AR) mining has limitations as it only deals with the items that are sold but ignores the items that are almost sold (for example, those items that are put into the basket but not checked out). We say that those almost sold items carry hesitation information, since customers are hesitating to buy them. The hesitation information of items is valuable knowledge for the design of good selling strategies. However, there is no conceptual model that is able to capture different statuses of hesitation information. Herein, we apply and extend vague set theory in the context of AR mining. We define the concepts of attractiveness and hesitation of an item, which represent the overall information of a customer's intent on an item. Based on the two concepts, we propose the notion of Vague Association Rules (VARs). We devise an efficient algorithm to mine the VARs. Our experiments show that our algorithm is efficient and the VARs capture more specific and richer information than do the traditional ARs.

  14. Implementation of the Apriori algorithm for association rule mining

    OpenAIRE

    Harvinder Chauhan; Anu Chauhan

    2014-01-01

    With massive amounts of data continuously being collected and stored, many industries are becoming interested in mining association rules from their databases. The discovery of interesting association relationships among huge amounts of business transaction records can help in many business decision making processes. Association rule mining comprises a set of algorithms, and mining the rules requires using one of them. Weka, a software tool for data mining tasks con...

  15. Performance Analysis of Genetic Algorithm for Mining Association Rules

    OpenAIRE

    Indira, K.; Kanmani, S.

    2012-01-01

    Association rule (AR) mining is a data mining task that attempts to discover interesting patterns or relationships between data in large databases. Genetic algorithm (GA) based on evolution principles has found its strong base in mining ARs. This paper analyzes the performance of GA in Mining ARs effectively based on the variations and modification in GA parameters. The recent works in the past seven years for mining association rules using genetic algorithm is considered for the analysis. Ge...

  16. Efficient Mining of Association Rules in Oscillatory-based Data

    OpenAIRE

    Mohammad Saniee Abadeh; Mojtaba Ala

    2011-01-01

    Association rules are one of the most researched areas of data mining. Finding frequent patterns is an important step in association rules mining which is very time consuming and costly. In this paper, an effective method for mining association rules in the data with the oscillatory value (up, down) is presented, such as the stock price variation in stock exchange, which, just a few numbers of the counts of itemsets are searched from the database, and the counts of the rest of itemsets are compute...

  17. Optimization of Association Rule Mining Apriori Algorithm Using ACO

    OpenAIRE

    Badri Patel; Vijay K Chaudhari; Rajneesh K Karan; YK Rana

    2011-01-01

    Association rule mining is an important topic in the data mining field. Given a large database of customer transactions, each transaction consists of items purchased by a customer in a visit. The Apriori algorithm generates all significant association rules between items in the database. On the basis of association rule mining and the Apriori algorithm, this paper proposes an improved algorithm based on the Ant Colony Optimization algorithm. We can optimize the result generated by Apriori al...

  18. Research on spatial association rules mining in two-direction

    Institute of Scientific and Technical Information of China (English)

    XUE Li-xia; WANG Zuo-cheng

    2007-01-01

    In data mining from transaction DB, the relationships between the attributes have been focused, but the relationships between the tuples have not been taken into account. In spatial database, there are relationships between the attributes and the tuples, and most of the associations occur between the tuples, such as adjacent, intersection, overlap and other topological relationships. So the tasks of spatial data association rules mining include mining the relationships between attributes of spatial objects, which are called as vertical direction DM, and the relationships between the tuples, which are called as horizontal direction DM. This paper analyzes the storage models of spatial data, uses for reference the technologies of data mining in transaction DB, defines the spatial data association rule, including vertical direction association rule, horizontal direction association rule and two-direction association rule, discusses the measurement of spatial association rule interestingness, and puts forward the work flows of spatial association rule data mining. During two-direction spatial association rules mining, an algorithm is proposed to get non-spatial itemsets. By virtue of spatial analysis, the spatial relations were transferred into non-spatial associations and the non-spatial itemsets were gotten. Based on the non-spatial itemsets, the Apriori algorithm or other algorithms could be used to get the frequent itemsets and then the spatial association rules come into being. Using spatial DB, the spatial association rules were gotten to validate the algorithm, and the test results show that this algorithm is efficient and can mine the interesting spatial rules.

  19. Association Rule Mining Based On Trade List

    CERN Document Server

    Shaikh, Sanober

    2012-01-01

    In this paper a new mining algorithm is defined based on frequent item sets. The Apriori algorithm scans the database every time it finds a frequent item set, so it is very time consuming, and at each step it generates candidate item sets. So for large databases it takes a lot of space to store the candidate item sets. The undirected item set graph is an improvement on Apriori, but it takes time and space for tree generation. The defined algorithm scans the database only once at the start and then generates the Trade List from the scanned database. It contains the information of the whole database. By considering the minimum support it finds the frequent item sets, and by considering the minimum confidence it generates the association rules. If the database and minimum support are changed, the new algorithm finds the new frequent items by scanning the Trade List. That is why its execution efficiency is distinctly improved compared to the traditional algorithm.
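
The scan-once idea can be sketched as follows. The abstract does not describe the layout of the Trade List, so the sketch substitutes a plain vertical index (item → set of transaction IDs) as a stand-in; the function names, the toy database and the brute-force enumeration are all assumptions made for illustration, not the author's algorithm.

```python
from collections import defaultdict
from itertools import combinations


def build_item_index(transactions):
    """Single pass over the database: map each item to the IDs of the
    transactions containing it (a stand-in for the paper's Trade List)."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return dict(index)


def itemset_count(itemset, index):
    """Support count of a non-empty itemset, computed from the index alone."""
    tid_sets = [index.get(item, set()) for item in itemset]
    return len(set.intersection(*tid_sets))


def frequent_itemsets(index, min_count, max_size=3):
    """Enumerate itemsets built from frequent single items; the original
    transactions are never rescanned."""
    items = sorted(i for i in index if len(index[i]) >= min_count)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            c = itemset_count(combo, index)
            if c >= min_count:
                result[frozenset(combo)] = c
    return result


def rules(frequent, min_confidence):
    """Split each frequent itemset into antecedent -> consequent rules and
    keep those meeting the minimum confidence."""
    out = []
    for itemset, c in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                conf = c / frequent[a]  # subsets of a frequent set are frequent
                if conf >= min_confidence:
                    out.append((a, itemset - a, conf))
    return out


# Hypothetical toy database of five transactions.
DB = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
INDEX = build_item_index(DB)           # the only pass over DB
FREQ = frequent_itemsets(INDEX, min_count=3)
RULES = rules(FREQ, min_confidence=0.75)
# Changing the threshold reuses INDEX instead of rescanning DB.
FREQ_LOOSE = frequent_itemsets(INDEX, min_count=2)
```

Thresholds are given as absolute counts rather than fractions to keep the comparisons exact. The point of the design is that a change in minimum support only touches `INDEX`, which mirrors the abstract's claim about rescanning the Trade List rather than the database.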

  20. Efficient mining of association rules based on gravitational search algorithm

    Directory of Open Access Journals (Sweden)

    Fariba Khademolghorani

    2011-07-01

    Association rule mining is one of the most used tools to discover relationships among attributes in a database. A lot of algorithms have been introduced for discovering these rules. These algorithms have to mine association rules in two separate stages. Most of them mine occurrence rules which are easily predictable by the users. Therefore, this paper discusses the application of the gravitational search algorithm for discovering interesting association rules. This evolutionary algorithm is based on Newtonian gravity and the laws of motion. Furthermore, contrary to the previous methods, the proposed method in this study is able to mine the best association rules without generating frequent itemsets and is independent of the minimum support and confidence values. The results of applying this method in comparison with the method of mining association rules based upon particle swarm optimization show that our method is successful.

  1. Mining association rule efficiently based on data warehouse

    Institute of Scientific and Technical Information of China (English)

    陈晓红; 赖邦传; 罗铤

    2003-01-01

    The conventional complete association rule set was replaced by the least association rule set in the data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) it should be the minimal and the simplest association rule set; 2) its predictive power should in no way be weaker than that of the complete association rule set so that the precision of the association rule set analysis can be guaranteed. By adopting the least association rule set, the pruning of weak rules can be effectively carried out so as to greatly reduce the number of frequent itemsets, and therefore improve the mining efficiency. Finally, based on the classical Apriori algorithm, the upward closure property of weak rules is utilized to develop a corresponding efficient algorithm.

  2. Efficient Analysis of Pattern and Association Rule Mining Approaches

    OpenAIRE

    Thabet Slimani; Amor Lazzez

    2014-01-01

    The process of data mining produces various patterns from a given data source. The most recognized data mining tasks are the process of discovering frequent itemsets, frequent sequential patterns, frequent sequential rules and frequent association rules. Numerous efficient algorithms have been proposed to do the above processes. Frequent pattern mining has been a focused topic in data mining research with a good number of references in literature and for that reason an important progress has...

  3. A Novel Approach for Association Rule Mining using Pattern Generation

    OpenAIRE

    Deepa S. Deshpande

    2014-01-01

    Data mining has become a process of significant interest in recent years due to explosive rate of the accumulation of data. It is used to discover potentially valuable implicit knowledge from the large transactional databases. Association rule mining is one of the well known techniques of data mining. It typically aims at discovering associations between attributes in the large databases. The first and the most influential traditional algorithm for association rule discovery is Apriori. Multi...

  4. Mining association rules using formal concept analysis

    OpenAIRE

    Pasquier, Nicolas

    2000-01-01

    In this paper, we give an overview of the use of Formal Concept Analysis in the framework of association rule extraction. Using frequent closed itemsets and their generators, that are defined using the Galois closure operator, we address two major problems: response times of association rule extraction and the relevance and usefulness of discovered association rules. We quickly review the Close and the A-Close algorithms for extracting frequent closed itemsets using their generators that redu...

  5. Boosting association rule mining in large datasets via Gibbs sampling.

    Science.gov (United States)

    Qian, Guoqi; Rao, Calyampudi Radhakrishna; Sun, Xiaoying; Wu, Yuehua

    2016-05-01

    Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred transaction items, if no action is taken to constrain the search space. In this paper, we develop a Gibbs-sampling-induced stochastic search procedure to randomly sample association rules from the itemset space, and perform rule mining from the reduced transaction dataset generated by the sample. Also a general rule importance measure is proposed to direct the stochastic search so that, as a result of the randomly generated association rules constituting an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. In the simulation study and a real genomic data example, we show how to boost association rule mining by an integrated use of the stochastic search and the Apriori algorithm. PMID:27091963

  6. Multi-Level Association Rule Mining: A Review

    Directory of Open Access Journals (Sweden)

    Dr. Jyoti

    2013-12-01

    Association rule mining is the most popular technique in the area of data mining. The main task of this technique is to find the frequent patterns by using minimum support thresholds decided by the user. The Apriori algorithm is a classical algorithm among association rule mining techniques. This algorithm is inefficient because it scans the database many times, and if the database is large, each scan takes too much time. In many cases, it is difficult to discover association rules among the objects at low levels of abstraction. Association rules among various item sets of databases can be found at various levels of abstraction. The Apriori algorithm does not mine the data on multiple levels of abstraction. Many algorithms in the literature have addressed this problem. This paper presents a survey of multi-level association rules and mining algorithms.
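
The repeated-scan cost mentioned in this abstract can be seen in a minimal level-wise sketch of classical Apriori. This is a textbook-style simplification written for illustration, not the algorithm of any paper in this listing; the function name `apriori`, the `passes` counter and the toy database are assumptions, and the multi-level extension the survey covers is not shown.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return (frequent itemsets with counts, number of database passes)."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)
    passes = 1

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        if not candidates:
            break

        # Counting step: one full scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        passes += 1

        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1

    return frequent, passes


# Hypothetical toy database of five transactions.
DB = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
FREQUENT, PASSES = apriori(DB, min_count=3)
print(len(FREQUENT), "frequent itemsets found in", PASSES, "passes")
```

Each level of candidates costs one more scan of every transaction, so the number of passes grows with the size of the largest candidate itemset; this is the inefficiency that the improved variants in this listing try to reduce.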

  7. Sampling based Association Rules Mining- A Recent Overview

    OpenAIRE

    V. Umarani; Dr. M. Punithavalli

    2010-01-01

    Association rule discovery from large databases is one of the tedious tasks in data mining. The process of frequent itemset mining, the first step in the mining of association rules, is a computational and IO intensive process necessitating repeated passes over the entire database. Sampling has been often suggested as an effective tool to reduce the size of the dataset operated at some cost to accuracy. Data mining literature presents with numerous sampling based approaches to speed up the proce...

  8. Mining association rule bases from integrated genomic data and annotations

    OpenAIRE

    Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

    2008-01-01

    During the last decade, several clustering and association rule mining techniques have been applied to identify groups of co-regulated genes in gene expression data. Nowadays, integrating biological knowledge and gene expression data into a single framework has become a major challenge to improve the relevance of mined patterns and simplify their interpretation by the biologists. The GenMiner approach was developed for mining association rules showing gene groups tha...

  9. Secure Mining of Association Rules in Horizontally Distributed Databases

    OpenAIRE

    Sonal Patil; Harshad Patil

    2014-01-01

    We propose a protocol for secure mining of association rules in horizontally distributed databases. Our protocol is optimized than the Fast Distributed Mining (FDM) algorithm which is an unsecured distributed version of the Apriori algorithm. The main purpose of our protocol is to remove the problem of mining generalized association rules that affects the existing system. Our protocol offers more enhanced privacy with respect to previous protocols. In addition, it is simpler and is optimized ...

  10. Compact Weighted Class Association Rule Mining using Information Gain

    CERN Document Server

    Ibrahim, S P Syed

    2011-01-01

    Weighted association rule mining reflects semantic significance of item by considering its weight. Classification constructs the classifier and predicts the new data instance. This paper proposes compact weighted class association rule mining method, which applies weighted association rule mining in the classification and constructs an efficient weighted associative classifier. This proposed associative classification algorithm chooses one non class informative attribute from dataset and all the weighted class association rules are generated based on that attribute. The weight of the item is considered as one of the parameter in generating the weighted class association rules. This proposed algorithm calculates the weight using the HITS model. Experimental results show that the proposed system generates less number of high quality rules which improves the classification accuracy.

  11. Efficient Mining of Association Rules in Oscillatory-based Data

    OpenAIRE

    Mohammad Saniee Abadeh; Mojtaba Ala

    2011-01-01

    Association rules are one of the most researched areas of data mining. Finding frequent patterns is an important step in association rules mining which is very time consuming and costly. In this paper, an effective method for mining association rules in the data with the oscillatory value (up, down) is presented, such as the stock price variation in stock exchange, which, just a few numbers of the counts of itemsets are searched from the database, and the counts of the rest of itemsets are co...

  12. Association Rule mining using Apriori Algorithm: A Review

    OpenAIRE

    Manisha Bhargava; Arvind Selwal

    2013-01-01

    Data mining or knowledge discovery is the process of discovering patterns in large data sets. In data mining each algorithm has a different objective and to obtain meaningful and previously unknown patterns from large dataset is an emerging and challenging problem. Association rule mining is a technique for discovering unsuspected data dependencies and is one of the best known data mining techniques. The basic Idea to identify from a given database, consisting of item sets (e.g. shopping basket...

  13. An Optimized Weighted Association Rule Mining On Dynamic Content

    CERN Document Server

    Velvadivu, P

    2010-01-01

    Association rule mining aims to explore large transaction databases for association rules. Classical Association Rule Mining (ARM) model assumes that all items have the same significance without taking their weight into account. It also ignores the difference between the transactions and importance of each and every itemsets. But, the Weighted Association Rule Mining (WARM) does not work on databases with only binary attributes. It makes use of the importance of each itemset and transaction. WARM requires each item to be given weight to reflect their importance to the user. The weights may correspond to special promotions on some products, or the profitability of different items. This research work first focused on a weight assignment based on a directed graph where nodes denote items and links represent association rules. A generalized version of HITS is applied to the graph to rank the items, where all nodes and links are allowed to have weights. This research then uses enhanced HITS algorithm by developing...

  14. Association rule mining as a support for OLAP

    OpenAIRE

    Chudán, David

    2010-01-01

    The aim of this work is to identify the possibilities of the complementary usage of two analytical methods of data analysis, OLAP analysis and data mining represented by GUHA association rule mining. The usage of these two methods in the context of proposed scenarios on one dataset presumes a synergistic effect, surpassing the knowledge acquired by these two methods independently. This is the main contribution of the work. Another contribution is the original use of GUHA association rules whe...

  15. An Ontological Approach for Mining Association Rules from Transactional Dataset

    Directory of Open Access Journals (Sweden)

    Sivanthiya.T

    2015-01-01

    Infrequent item sets are mined in order to reduce the cost function and to make the sale of a rare data correlated item set. In past research, algorithms like Infrequent Weighted Item Set Miner and Minimal Infrequent Weighted Item Set Miner were used. Since mining of infrequent item sets is done by requiring a support count less than or equal to the maximum support count, a large number of rules were generated, and the mined results do not guarantee that only interesting rules were extracted, as interestingness strongly depends on the user's knowledge and goals. Hence, an Ontology Relational Weights Measure using a Weighted Association Rule Mining approach is introduced to integrate the user’s knowledge, minimize the number of rules and mine the interesting infrequent item sets.

  16. Performance Evaluation of Algorithms using a Distributed Data Mining Frame Work based on Association Rule Mining

    OpenAIRE

    P. T. Kavitha; Dr. T. Sasipraba

    2011-01-01

    Numerous current data mining tasks can be implemented effectively only in a distributed data mining setting. Thus distributed data mining has achieved significant importance in the last decade. The proposed distributed data mining application framework is a data mining tool. This framework aims at developing an efficient association rule mining tool to support effective decision making. Association rule mining focuses on finding interesting patterns from huge amount of data available in the data ware...

  17. Mining multilevel spatial association rules with cloud models

    Institute of Scientific and Technical Information of China (English)

    YANG Bin; ZHU Zhong-ying

    2005-01-01

    The traditional generalization-based knowledge discovery method is introduced. A new kind of multilevel spatial association of the rules mining method based on the cloud model is presented. The cloud model integrates the vague and random use of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are well generalized at multiple levels, allowing discovery of strong spatial association rules. Combining the cloud model based method with Apriori algorithms for mining association rules from a spatial database shows benefits in being effective and flexible.

  18. A Survey of Association Rule Mining Using Genetic Algorithm

    OpenAIRE

    Anubha Sharma

    2012-01-01

    Data mining is the analysis step of the "Knowledge Discovery in Databases" process, or KDD. It is the process that results in the discovery of new patterns in large data sets. It utilizes methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract knowledge from an existing data set and transform it into a human-understandable structure. In data mining, association rule learning is a popu...

  19. IP- Apriori: Improved Pruning in Apriori for Association Rule Mining

    OpenAIRE

    PG Scholar Mr. Prince Verma; Dinesh Kumar

    2013-01-01

    Association rule mining, which is of great importance and use, is one of the vital techniques for data mining. Chief among the association rule mining techniques has been Apriori, and many more approaches have been introduced with minute changes to Apriori, but their basic concept remains the same, i.e., the use of support and confidence threshold(s). To the best of our knowledge, no work has been done in the field of improving the pruning step of Apriori. This paper introduces a ...

  20. Secure Mining of Association Rules in Horizontally Distributed Databases

    Directory of Open Access Journals (Sweden)

    Sonal Patil

    2014-03-01

    We propose a protocol for secure mining of association rules in horizontally distributed databases. Our protocol is optimized than the Fast Distributed Mining (FDM) algorithm which is an unsecured distributed version of the Apriori algorithm. The main purpose of our protocol is to remove the problem of mining generalized association rules that affects the existing system. Our protocol offers more enhanced privacy with respect to previous protocols. In addition, it is simpler and is optimized in terms of communication rounds, communication cost and computational cost than other protocols.

  1. Secure Mining of Association Rules in Horizontally Distributed Databases

    Directory of Open Access Journals (Sweden)

    Sonal Patil

    2015-11-01

    We propose a protocol for secure mining of association rules in horizontally distributed databases. Our protocol is optimized than the Fast Distributed Mining (FDM) algorithm which is an unsecured distributed version of the Apriori algorithm. The main purpose of our protocol is to remove the problem of mining generalized association rules that affects the existing system. Our protocol offers more enhanced privacy with respect to previous protocols. In addition, it is simpler and is optimized in terms of communication rounds, communication cost and computational cost than other protocols.

  2. A Survey of Association Rule Mining Using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Anubha Sharma

    2012-08-01

    Full Text Available Data mining is the analysis step of the "Knowledge Discovery in Databases" process, or KDD. It is the process that results in the discovery of new patterns in large data sets. It utilizes methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract knowledge from an existing data set and transform it into a human-understandable structure. In data mining, association rule learning is a popular and well researched method for discovering interesting relations between variables in large databases. Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time. A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms, which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. Previously, many researchers have proposed genetic algorithms for mining interesting association rules from quantitative data. In this paper we present a survey of association rule mining using genetic algorithms. The techniques are categorized based upon different approaches. This paper covers the major advancements in the approaches for association rule mining using genetic algorithms.

  3. Association Rule Mining from an Intelligent Tutor

    Science.gov (United States)

    Dogan, Buket; Camurcu, A. Yilmaz

    2008-01-01

    Educational data mining is a very novel research area, offering fertile ground for many interesting data mining applications. Educational data mining can extract useful information from educational activities for better understanding and assessment of the student learning process. In this way, it is possible to explore how students learn topics in…

  4. An Optimized Weighted Association Rule Mining On Dynamic Content

    Directory of Open Access Journals (Sweden)

    P. Velvadivu

    2010-03-01

    Full Text Available Association rule mining aims to explore large transaction databases for association rules. The classical Association Rule Mining (ARM) model assumes that all items have the same significance, without taking their weight into account. It also ignores the differences between transactions and the importance of each itemset. In contrast, Weighted Association Rule Mining (WARM) does not work on databases with only binary attributes. It makes use of the importance of each itemset and transaction. WARM requires each item to be given a weight to reflect its importance to the user. The weights may correspond to special promotions on some products, or the profitability of different items. This research work first focuses on a weight assignment based on a directed graph where nodes denote items and links represent association rules. A generalized version of HITS is applied to the graph to rank the items, where all nodes and links are allowed to have weights. This research then enhances the HITS algorithm by developing an online eigenvector calculation method that can compute the results of mutual reinforcement voting in case of frequent updates. For example, in the share market, share prices may go down or up, so we need to watch the market carefully, and our association rule mining has to produce the items that have undergone frequent changes. This is done by estimating the upper bound of the perturbation and postponing updates whenever possible. Next we prove that the enhanced algorithm is more efficient than the original HITS in the context of dynamic data.

  5. APPLYING PARALLEL ASSOCIATION RULE MINING TO HETEROGENEOUS ENVIRONMENT

    Directory of Open Access Journals (Sweden)

    P.Asha

    2013-08-01

    Full Text Available The work aims to discover frequent patterns by generating the candidates, to frame the association rules, and then to filter out only the efficient rules based on various rule interestingness measures. As all of these require heavy computation, applying complete parallelization to every individual phase would yield better performance. The paper illustrates the system behavior in a heterogeneous environment with both shared memory and distributed memory parallelization while efficiently mining the data.

  6. EVALUATING THE PERFORMANCE OF ASSOCIATION RULE MINING ALGORITHMS

    Directory of Open Access Journals (Sweden)

    K. Vanitha

    2011-07-01

    Full Text Available Association rule mining is one of the most popular data mining methods. However, mining association rules often results in a very large number of found rules, leaving the analyst with the task of going through all the rules to discover the interesting ones. In this paper, we present a performance comparison of the Apriori and FP-growth algorithms. The performance is analyzed based on the execution time for different numbers of instances and confidence levels on a supermarket data set. These algorithms are presented together with some experimental data. Our performance study shows that the FP-growth method is efficient and scalable and is about an order of magnitude faster than the Apriori algorithm.

  7. Association Rule Mining for Web Recommendation

    OpenAIRE

    R. Suguna; D. Sharmila

    2012-01-01

    Web usage mining is the application of web mining to discover useful patterns from the web in order to understand and analyze the behavior of web users and web-based applications. It is the emerging research trend for today’s researchers. It deals entirely with web log files, which contain the user website access information. It is interesting to analyze and understand user behavior regarding web access. Web usage mining normally has three categories: 1. Preprocessing, 2. Pa...

  8. Detection of Attacks on MAODV Association Rule Mining Optimization

    Directory of Open Access Journals (Sweden)

    A. Fidalcastro

    2015-02-01

    Full Text Available Current mining algorithms can generate a large number of rules and be very slow to generate them, or generate few results, omitting interesting and valuable information. To address this problem, we propose the Optimized Featured Top Association Rules (OFTAR) algorithm, where every attack has many features and some of the features are more important. The features are selected by a genetic algorithm and processed by the OFTAR algorithm to find the optimized rules. The algorithm utilizes a genetic algorithm feature selection approach to find optimized features. OFTAR incorporates association rules with several rule optimization and expansion techniques to improve efficiency. The increasing popularity of mobile ad hoc networks (MANETs) among users of wireless networks leads to threats and attacks on MANETs, due to their features. The main challenge in designing a MANET is protecting it from various attacks in the network. An intrusion detection system is required to monitor the network and to detect malicious nodes in a multicasting mobility environment. The node features are processed in association analysis to generate rules, and the generated rules are applied to nodes to detect the attacks. Experimental results show that the algorithm has higher scalability and good performance, which is an advantage over several association rule mining algorithms when the rule generation is controlled and optimized to detect the attacks.

  9. Association Rules Mining Using Majority Voting in the Stock Data

    Directory of Open Access Journals (Sweden)

    Mukesh Kumar

    2012-01-01

    Full Text Available A time series data set consists of a sequence of values or events that change with time. Stock data mining plays an important role in visualizing the behavior of the financial market. Every investor wants to know or predict the trends of stock trading. Association rule mining algorithms can be used to discover all item associations (or rules) in a dataset that satisfy user-specified constraints, i.e. minimum support and minimum confidence. Traditional association analysis is intra-transactional because it concerns items within the same transaction. In this paper, patterns are evaluated by generating association rules with a majority voting approach. The rules having the same consequent and the higher vote are picked to determine the stock pattern. The experimental results demonstrate notably similar patterns as well as a categorization of stocks. The patterns so generated help investors to build their portfolios and to learn more about investment planning and the financial market.

  10. Optimizing Mining Association Rules for Artificial Immune System based Classification

    Directory of Open Access Journals (Sweden)

    SAMEER DIXIT

    2011-08-01

    Full Text Available The primary function of a biological immune system is to protect the body from foreign molecules known as antigens. It has a great pattern recognition capability that may be used to distinguish between foreign cells entering the body (non-self or antigen) and the body's own cells (self). Immune systems have many characteristics, such as uniqueness, autonomy, recognition of foreigners, distributed detection, and noise tolerance. Inspired by biological immune systems, Artificial Immune Systems have emerged during the last decade. They have prompted many researchers to design and build immune-based models for a variety of application domains. Artificial immune systems can be defined as a computational paradigm that is inspired by theoretical immunology and by observed immune functions, principles and mechanisms. Association rule mining is one of the most important and well researched techniques of data mining. The goal of association rules is to extract interesting correlations, frequent patterns, associations or causal structures among sets of items in transaction databases or other data repositories. Association rules are widely used in various areas such as inventory control, telecommunication networks, intelligent decision making, market analysis and risk management. Apriori is the most widely used algorithm for mining association rules. Other popular association rule mining algorithms are frequent pattern (FP) growth, Eclat, dynamic itemset counting (DIC), etc. Associative classification uses association rule mining in the rule discovery process to predict the class labels of the data. This technique has shown great promise over many other classification techniques. Associative classification also integrates the process of rule discovery and classification to build the classifier for the purpose of prediction. The main problem with the associative classification approach is the discovery of high-quality association rules in a very large space of

  11. Promoter Sequences Prediction Using Relational Association Rule Mining

    OpenAIRE

    Gabriela Czibula; Maria-Iuliana Bocicor; Istvan Gergely Czibula

    2012-01-01

    In this paper we are approaching, from a computational perspective, the problem of promoter sequences prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still developed to approach the problem of promoter identification in the DNA. We are proposing a classification model based on relational association rules mining. Relational association rules are a...

  12. A Novel Approach for Association Rule Mining using Pattern Generation

    Directory of Open Access Journals (Sweden)

    Deepa S. Deshpande

    2014-10-01

    Full Text Available Data mining has become a process of significant interest in recent years due to the explosive rate of data accumulation. It is used to discover potentially valuable implicit knowledge from large transactional databases. Association rule mining is one of the well known techniques of data mining. It typically aims at discovering associations between attributes in large databases. The first and most influential traditional algorithm for association rule discovery is Apriori. Multiple scans of the database, the generation of a large number of candidate item sets, and the discovery of interesting rules are the main challenges in improving the Apriori algorithm. Therefore, in order to reduce the repeated scanning of the database, a new method of association rule mining using pattern generation is proposed in this paper. This method involves three steps. First, patterns are generated using items from the transaction database. Second, the frequent item sets are obtained using these patterns. Finally, association rules are derived. The performance of this method is evaluated against the traditional Apriori algorithm. The evaluation shows that the behavior of the proposed method is similar to that of the Apriori algorithm, with less memory space and fewer database scans. Thus it is more efficient than the traditional Apriori algorithm.
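For orientation, the baseline that this abstract compares against can be sketched as follows. This is a minimal illustration of the classical level-wise Apriori procedure (join, prune, then one database scan per level), not the pattern-generation method proposed in the paper. The function name `apriori`, the absolute `min_support` count, and the sample baskets are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items with one pass over the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # One further database scan counts the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical sample data for illustration only.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
frequent_itemsets = apriori(baskets, min_support=2)
```

Each pass of the `while` loop rescans the whole transaction list, which is the repeated-scan cost the abstract refers to.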

  13. Database Reverse Engineering based on Association Rule Mining

    Directory of Open Access Journals (Sweden)

    Nattapon Pannurat

    2010-03-01

    Full Text Available Maintaining a legacy database is a difficult task, especially when system documentation is poorly written or even missing. Database reverse engineering is an attempt to recover the high-level conceptual design from the existing database instances. In this paper, we propose a technique to discover the conceptual schema using the association mining technique. The discovered schema corresponds to normalization at the third normal form, which is a common practice in many business organizations. Our algorithm also includes a rule filtering heuristic to solve the problem of exponential growth of discovered rules that is inherent in the association mining technique.

  14. Mining Video Association Rules Based on Weighted Temporal Concepts

    Directory of Open Access Journals (Sweden)

    V.Vijayakumar

    2012-07-01

    Full Text Available Discovery of video association rules has been found useful in many applications that explore video knowledge, such as video indexing, summarization, classification and semantic event detection. Traditional classical association rule mining algorithms cannot be applied directly to a video database. It differs in two ways: the spatial and temporal properties of the video database, and the significance of the items in the video cluster sequence. The proposed paper discovers significant relationships in video sequences using weighted temporal concepts. The weights of the video items take the quality of transactions into consideration using modified link-based models. The proposed modified HITS-based weighted temporal concept does not require pre-assigned weights. The mined association rules have more practical significance. This strategy identifies valuable rules in comparison with an Apriori-based video sequence algorithm. We also present results of applying these algorithms to a synthetic data set, which show the effectiveness of our algorithm.

  15. Efficient Mining of Association Rules in Oscillatory-based Data

    Directory of Open Access Journals (Sweden)

    Mohammad Saniee Abadeh

    2011-12-01

    Full Text Available Association rules are one of the most researched areas of data mining. Finding frequent patterns is an important step in association rule mining, and it is very time consuming and costly. In this paper, an effective method is presented for mining association rules in data with oscillatory values (up, down), such as stock price variation in a stock exchange. In this method, the counts of just a few itemsets are searched from the database, and the counts of the remaining itemsets are computed using the relationships that exist between these types of data. Also, a pruning strategy is used to decrease the search space and increase the rate of the mining process. Thus, there is no need to investigate all frequent patterns from the database, and it takes less time to find frequent patterns. By executing the MR-Miner (an acronym for “Math Rules-Miner”) algorithm, its performance on real stock data is analyzed and shown. Our experiments show that the MR-Miner algorithm can find association rules very efficiently in data of the oscillatory value type.

  16. Secure Association Rule Mining for Distributed Level Hierarchy in Web

    OpenAIRE

    Gulshan Shrivastava; Dr. Vishal Bhatnagar

    2011-01-01

    Data mining technology can analyze massive data and plays a very important role in many domains, but if it is used improperly it can also cause new information security problems. Thus several privacy preserving techniques for association rule mining have also been proposed in the past few years. Various algorithms have been developed for centralized data, while others refer to distributed data scenario. Distributed data Scenarios can also be classified as heterogeneous distributed data and hom...

  17. Mining fuzzy association rules in spatio-temporal databases

    Science.gov (United States)

    Shu, Hong; Dong, Lin; Zhu, Xinyan

    2008-12-01

    A huge amount of geospatial and temporal data has been collected through various networks of environment monitoring stations. For instance, daily precipitation and temperature are observed at hundreds of meteorological stations in Northeastern China. However, these massive raw data from the stations are not fully utilized for meeting the requirements of human decision-making. In nature, the discovery of geographical data mining is the computation of multivariate spatio-temporal correlations through the stages of data mining. In this paper, a procedure for mining association rules in regional climate-changing databases is introduced. The methods of Kriging interpolation, fuzzy c-means clustering, and Apriori-based logical rule extraction are employed in sequence. Formally, we define geographical spatio-temporal transactions and fuzzy association rules. Innovatively, we perform fuzzy data conceptualization by means of fuzzy c-means clustering, and transform fuzzy data items with membership grades into Boolean data items with weights by means of λ-cut sets. When the Apriori algorithm is executed on Boolean transactions with weights, fuzzy association rules are derived. Fuzzy association rules are more natural than crisp association rules for human cognition of reality.
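The λ-cut step mentioned in this abstract can be illustrated with a small sketch. An item is kept as a Boolean item when its membership grade is at least λ, and the grade is retained as the item's weight. The abstract does not say how weights are combined when counting support, so the minimum-weight rule in `weighted_support` is an assumption of this example, as are the function names and the sample data.

```python
def lambda_cut(fuzzy_transaction, lam):
    """Keep items whose membership grade is at least lam; keep the grade as weight."""
    return {item: grade for item, grade in fuzzy_transaction.items() if grade >= lam}

def weighted_support(weighted_transactions, itemset):
    """Weighted support of an itemset.

    Assumption of this sketch: a transaction containing the whole itemset
    contributes the minimum weight among the itemset's items.
    """
    total = 0.0
    for t in weighted_transactions:
        if all(item in t for item in itemset):
            total += min(t[item] for item in itemset)
    return total / len(weighted_transactions)

# Hypothetical membership grades, e.g. as produced by fuzzy c-means clustering.
days = [
    {"rain_high": 0.8, "temp_low": 0.6, "temp_mid": 0.3},
    {"rain_high": 0.4, "temp_low": 0.9},
    {"rain_high": 0.7, "temp_low": 0.5},
]
cut = [lambda_cut(day, 0.5) for day in days]
sup = weighted_support(cut, ["rain_high", "temp_low"])
```

With λ = 0.5, the second day loses `rain_high`, so only the first and third days support the pair.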

  18. Associative Regressive Decision Rule Mining for Predicting Customer Satisfactory Patterns

    Directory of Open Access Journals (Sweden)

    P. Suresh

    2016-04-01

    Full Text Available Opinion mining, also known as sentiment analysis, involves customer satisfactory patterns, sentiments and attitudes toward entities, products, services and their attributes. With the rapid development of the Internet, potential customers provide a satisfactory level of product/service reviews. High volumes of customer reviews were developed for product/review through taxonomy-aware processing, but it was difficult to identify the best reviews. In this paper, an Associative Regression Decision Rule Mining (ARDRM) technique is developed to predict the pattern for the service provider and to improve customer satisfaction based on the review comments. Associative Regression based Decision Rule Mining performs two steps for improving the customer satisfactory level. Initially, the Machine Learning Bayes Sentiment Classifier (MLBSC) is used to classify the class labels for each service review. After that, the regressive factor of the opinion words and class labels is checked for association between the words by using various probabilistic rules. Based on the probabilistic rules, the effect of opinions and sentiments on customer reviews is analyzed to arrive at the specific set of services preferred by the customers, together with their review comments. The Associative Regressive Decision Rule helps the service provider to take decisions on improving the customer satisfactory level. The experimental results reveal that the Associative Regression Decision Rule Mining (ARDRM) technique improved performance in terms of true positive rate, Associative Regression factor, Regressive Decision Rule Generation time and Review Detection Accuracy of similar patterns.

  19. The GUHA Method and Mining Association Rules

    Czech Academy of Sciences Publication Activity Database

    Hájek, Petr

    Canada : ICSC Academic Press, 2001 - (Kuncheva, L.), s. 533-539 ISBN 3-906454-26-6. [CIMA 2001. Bangor (GB), 19.06.2001-22.06.2001] R&D Projects: GA MŠk OC 274.001 Grant ostatní: COST(XE) Action 274 TARSKI Institutional research plan: AV0Z1030915 Keywords : GUHA * hypothesis generation * data mining * exploratory data analysis Subject RIV: BA - General Mathematics

  20. An Algorithm for Mining Multidimensional Fuzzy Association Rules

    CERN Document Server

    Khare, Neelu; Pardasani, K R

    2009-01-01

    Multidimensional association rule mining searches for interesting relationships among the values from different dimensions or attributes in a relational database. In this method the correlation is among a set of dimensions, i.e., the items forming a rule come from different dimensions. Therefore each dimension should be partitioned at the fuzzy set level. This paper proposes a new algorithm for generating multidimensional association rules by utilizing fuzzy sets. For a database consisting of fuzzy transactions, the Apriori property is employed to prune useless candidate itemsets.

  1. Efficient Mining of Association Rules in Oscillatory-based Data

    Directory of Open Access Journals (Sweden)

    Mohammad Saniee Abadeh & Mojtaba Ala

    2011-12-01

    Full Text Available Association rules are one of the most researched areas of data mining. Finding frequent patterns is an important step in association rule mining, and it is very time consuming and costly. In this paper, an effective method is presented for mining association rules in data with oscillatory values (up, down), such as stock price variation in a stock exchange. In this method, the counts of just a few itemsets are searched from the database, and the counts of the remaining itemsets are computed using the relationships that exist between these types of data. Also, a pruning strategy is used to decrease the search space and increase the rate of the mining process. Thus, there is no need to investigate all frequent patterns from the database, and it takes less time to find frequent patterns. By executing the MR-Miner (an acronym for “Math Rules-Miner”) algorithm, its performance on real stock data is analyzed and shown. Our experiments show that the MR-Miner algorithm can find association rules very efficiently in data of the oscillatory value type.

  2. Secure Association Rule Mining for Distributed Level Hierarchy in Web

    Directory of Open Access Journals (Sweden)

    Gulshan Shrivastava,

    2011-06-01

    Full Text Available Data mining technology can analyze massive data and plays a very important role in many domains, but if it is used improperly it can also cause new information security problems. Thus several privacy preserving techniques for association rule mining have been proposed in the past few years. Various algorithms have been developed for centralized data, while others address the distributed data scenario. Distributed data scenarios can also be classified as heterogeneous distributed data and homogeneous distributed data, and we identify that distributed data can be partitioned horizontally (a.k.a. homogeneous distribution) or vertically (a.k.a. heterogeneous distribution). In this paper, we propose an algorithm for secure association rule mining for the vertical partition.

  3. Aggregate Function Based Enhanced Apriori Algorithm for Mining Association Rules

    Directory of Open Access Journals (Sweden)

    Medhat H A Awadalla

    2012-05-01

    Full Text Available Association rule analysis is the task of discovering association rules that occur frequently in a given transaction data set. Its task is to find certain relationships among a set of data (itemset) in the database. It has two measurements: support and confidence values. The confidence value is a measure of a rule's strength, while the support value corresponds to statistical significance. Traditional association rule mining techniques employ predefined support and confidence values. However, specifying the minimum support value of the mined rules in advance often leads to either too many or too few rules, which negatively impacts the performance of the overall system. To replace Apriori's user-defined minimum threshold value, this paper proposes an aggregate function based on the Central Limit Theorem (CLT) that calculates a more meaningful minimum threshold value. The paper also proposes a new function, the Specified Minimum Support value function with bit mapping, which calculates a custom minimum support for each item set based on the probability of collision chance of its items. Furthermore, a modification of the Apriori algorithm to accommodate this function is proposed. Experiments on a large set of databases have been conducted to validate the proposed framework. The achieved results show that there is a remarkable improvement in the overall performance of the system in terms of run time, the number of generated rules, and the number of frequent items used.
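The two measurements named in this abstract have standard definitions that a short sketch makes concrete: support is the fraction of transactions containing an itemset, and the confidence of a rule A → B is support(A ∪ B) divided by support(A). The function names and sample baskets below are assumptions for illustration; the CLT-based threshold function proposed in the paper is not reproduced here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies; a convention chosen for this sketch
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical sample data for illustration only.
orders = [{"tea", "sugar"}, {"tea", "lemon"}, {"tea", "sugar", "lemon"}, {"sugar"}]
s = support(orders, {"tea", "sugar"})
c = confidence(orders, {"tea"}, {"sugar"})
```

In this sample, `{"tea", "sugar"}` appears in 2 of 4 orders, and 2 of the 3 orders containing `tea` also contain `sugar`. A rule is reported only when both values meet their minimum thresholds.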

  4. ANALYSIS ON SUSPICIOUS THYROID RECOGNITION USING ASSOCIATION RULE MINING

    Directory of Open Access Journals (Sweden)

    K. Saravana Kumar

    2012-10-01

    Full Text Available Thyroid cancer was the most common type of cancer in the country, overtaking gastric cancer for the first time last year. This paper proposes applying association rule mining to suspected thyroid diseases. We apply the model of deception to a thyroid dataset and then apply the Apriori algorithm to generate the rules. The rules generated are used to test the thyroid as deceptive or not. In particular, we are interested in detecting thyroid conditions related to critical activities. After classification, we must be able to differentiate between thyroid records giving information about hyperthyroid and hypothyroid conditions (informative thyroid) and those acting as alerts (warnings) for future critical activities.

  5. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.

  6. Implementation of Association Rules in Distributed Databases using Privacy Preserving Mining Algorithm

    OpenAIRE

    N. TEJCHAND

    2012-01-01

    Mining of data is used to retrieve necessary data from integrated data sources. Association rule mining is one of the most important applications of data mining and one of its major problems. Recently, because of the need for security, more and more people are studying privacy-preserving association rule mining in distributed databases. This paper addresses a secure mining algorithm of association rules, which builds a global hash table to prune itemsets and incorporate cryptographic techniq...

  7. Classification approach based on association rules mining for unbalanced data

    CERN Document Server

    Ndour, Cheikh

    2012-01-01

    This paper deals with supervised classification when the response variable is binary and its class distribution is unbalanced. In such a situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification trees, discriminant analysis, etc. To overcome the shortcoming of these methods, which provide classifiers with low sensitivity, we tackle the classification problem here through an approach based on association rule learning, because this approach has the advantage of allowing the identification of patterns that are well correlated with the target class. Association rule learning is a well known method in the area of data mining. It is used when dealing with large databases for the unsupervised discovery of local patterns that express hidden relationships between variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classification rule...

  8. Integrating User's Domain Knowledge with Association Rule Mining

    CERN Document Server

    Singh, Vikram

    2010-01-01

    This paper presents a variation of Apriori algorithm that includes the role of domain expert to guide and speed up the overall knowledge discovery task. Usually, the user is interested in finding relationships between certain attributes instead of the whole dataset. Moreover, he can help the mining algorithm to select the target database which in turn takes less time to find the desired association rules. Variants of the standard Apriori and Interactive Apriori algorithms have been run on artificial datasets. The results show that incorporating user's preference in selection of target attribute helps to search the association rules efficiently both in terms of space and time.

  9. Mining Frequent Generalized Itemsets and Generalized Association Rules Without Redundancy

    Institute of Scientific and Technical Information of China (English)

    Daniel Kunkle; Donghui Zhang; Gene Cooperman

    2008-01-01

    This paper presents some new algorithms to efficiently mine max frequent generalized itemsets (g-itemsets) and essential generalized association rules (g-rules). These are compact and general representations for all frequent patterns and all strong association rules in the generalized environment. Our results fill an important gap among algorithms for frequent patterns and association rules by combining two concepts. First, generalized itemsets employ a taxonomy of items, rather than a flat list of items. This produces more natural frequent itemsets and associations such as (meat, milk) instead of (beef, milk), (chicken, milk), etc. Second, compact representations of frequent itemsets and strong rules, whose result size is exponentially smaller, can solve a standard dilemma in mining patterns: with small threshold values for support and confidence, the user is overwhelmed by the extraordinary number of identified patterns and associations; but with large threshold values, some interesting patterns and associations fail to be identified. Our algorithms can also expand those max frequent g-itemsets and essential g-rules into the much larger set of ordinary frequent g-itemsets and strong g-rules. While that expansion is not recommended in most practical cases, we do so in order to present a comparison with existing algorithms that only handle ordinary frequent g-itemsets. In this case, the new algorithm is shown to be thousands, and in some cases millions, of times faster than previous algorithms. Further, the new algorithm succeeds in analyzing deeper taxonomies, with depths of seven or more. Experimental results for previous algorithms limited themselves to taxonomies with depth at most three or four. In each of the two problems, a straightforward lattice-based approach is briefly discussed and then a classification-based algorithm is developed. In particular, the two classification-based algorithms are MFGI_class for mining max frequent g-itemsets and EGR
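The (meat, milk) example in this abstract can be made concrete with a small sketch of support counting under a taxonomy. It uses the simple approach of extending each transaction with the ancestors of its items, which is not the MFGI_class algorithm described in the paper. The taxonomy, function names, and sample transactions are assumptions for illustration.

```python
# Hypothetical taxonomy: child -> parent.
PARENT = {"beef": "meat", "chicken": "meat"}

def ancestors(item, parent):
    """All ancestors of an item, nearest first."""
    found = []
    while item in parent:
        item = parent[item]
        found.append(item)
    return found

def extend(transaction, parent):
    """Add every ancestor of every item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, parent))
    return extended

def g_support_count(transactions, itemset, parent):
    """Count transactions whose ancestor-extended form contains the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= extend(t, parent))

# Hypothetical sample data for illustration only.
sales = [{"beef", "milk"}, {"chicken", "milk"}, {"beef"}, {"milk"}]
beef_milk = g_support_count(sales, {"beef", "milk"}, PARENT)
chicken_milk = g_support_count(sales, {"chicken", "milk"}, PARENT)
meat_milk = g_support_count(sales, {"meat", "milk"}, PARENT)
```

Neither specific pair occurs more than once here, while the generalized pair (meat, milk) occurs twice, so with a minimum count of 2 only the generalized itemset would be frequent.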

  10. Feasibility study for banking loan using association rule mining classifier

    Directory of Open Access Journals (Sweden)

    Agus Sasmito Aribowo

    2015-03-01

    Full Text Available The problem of bad loans in a koperasi (cooperative) can be reduced if the koperasi can detect whether a member will be able to pay off the debt or not. The method used in this study to identify characteristic patterns of prospective applicants is called the Association Rule Mining Classifier. Patterns of credit members are converted into knowledge and used to classify other applicants. The classification process separates applicants into two groups: good credit and bad credit. The research uses prototyping to implement the design as an application. The association rule mining process uses the Weighted Itemset Tidset (WIT-tree) method. The results show that the method can predict the credit of prospective customers. The training data set consists of 120 customers whose credit history is already known. The test data consists of 61 customers who applied for credit. The results indicated that 42 customers would pay off their loans and 19 would be declined.

  11. A New Parallel Algorithm for Mining Association Rules

    Institute of Scientific and Technical Information of China (English)

    DING Yan-hui; WANG Hong-guo; GAO Ming; GU Jian-jun

    2006-01-01

    Mining association rules from large databases is very costly. We develop a parallel algorithm for this task on a shared-memory multiprocessor (SMP). Most proposed parallel algorithms for association rule mining have to scan the database at least twice. In this article, a parallel algorithm, Scan Once (SO), is proposed for SMP, which scans the database only once. This algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It adopts a bit matrix to store the database information and obtains the support of the frequent itemsets by a Vector-And-Operation, which greatly improves the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the known CD algorithm.
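The bit-matrix idea in the abstract above can be illustrated with a short single-threaded sketch. This is not the SO algorithm itself (the abstract gives no implementation details, and no parallelism is shown here); the function names and the sample transactions are hypothetical. Each item is stored as one integer whose bit i is set when transaction i contains the item, so the support of an itemset is the number of bits that survive an AND over its item columns.

```python
def build_bit_columns(transactions):
    """Single scan: map each item to an int whose bit i marks transaction i."""
    columns = {}
    for row, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def support_count(columns, itemset, n_rows):
    """AND the columns of all items and count the bits that remain set."""
    mask = (1 << n_rows) - 1  # start with every transaction selected
    for item in itemset:
        mask &= columns.get(item, 0)  # unknown items clear the mask
    return bin(mask).count("1")


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
columns = build_bit_columns(transactions)
pair_support = support_count(columns, {"bread", "milk"}, len(transactions))
```

After the single scan that builds `columns`, every further support query touches only the bit columns, never the original transactions.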

  12. A Novel Pruning Approach for Association Rule Mining

    OpenAIRE

    Lalit Mohan Goyal; M. M. Sufyan Beg; Tanvir Ahmad

    2015-01-01

    The problem of association rule mining (ARM) can be solved by using the Apriori algorithm, which consists of three steps: joining, pruning and verification. The pruning step plays an important role in eliminating weak candidate itemsets. In this paper, a new pruning step is proposed as an alternative to Apriori's pruning step. This alternative is described as a filtration step. Five experiments are carried out to show that the proposed pruning method works as efficiently as Apriori's pruning method.
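For reference, the classical Apriori joining and pruning steps mentioned above can be sketched as follows. This shows the textbook version only, not the alternative filtration step proposed in the paper (which the abstract does not describe); the function names and sample itemsets are hypothetical.

```python
from itertools import combinations


def join_step(frequent_k):
    """Join frequent k-itemsets that agree on their first k-1 sorted items."""
    ordered = sorted(tuple(sorted(s)) for s in frequent_k)
    candidates = set()
    for a, b in combinations(ordered, 2):
        if a[:-1] == b[:-1]:
            candidates.add(frozenset(a) | frozenset(b))
    return candidates


def prune_step(candidates, frequent_k):
    """Drop candidates that have at least one infrequent k-subset."""
    frequent = {frozenset(s) for s in frequent_k}
    kept = set()
    for cand in candidates:
        k = len(cand) - 1
        if all(frozenset(sub) in frequent for sub in combinations(cand, k)):
            kept.add(cand)
    return kept


# Hypothetical frequent 2-itemsets, for illustration only.
frequent_2 = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"b", "d"}]
candidates = join_step(frequent_2)
survivors = prune_step(candidates, frequent_2)
```

Here the join produces {a, b, c} and {b, c, d}; pruning removes {b, c, d} because its subset {c, d} is not frequent. The verification step would then count the support of the survivors against the database.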

  13. A Privacy Preserving Association Rule Mining Over Unrealized Datasets

    OpenAIRE

    Sunil kumar chintada; JayanthiRaoMadina

    2013-01-01

    In this paper we propose an efficient empirical model of privacy preserving association rule mining based on a Boolean matrix approach; for security, we use the RSA algorithm for secure data transmission. In this approach we reduce the time complexity of finding patterns by using the Boolean matrix, and communication is done with cipher datasets instead of plain datasets.

  14. COLLABORATIVE NETWORK SECURITY MANAGEMENT SYSTEM BASED ON ASSOCIATION MINING RULE

    OpenAIRE

    Nisha Mariam Varughese

    2014-01-01

    Security is one of the major challenges in open network. There are so many types of attacks which follow fixed patterns or frequently change their patterns. It is difficult to find the malicious attack which does not have any fixed patterns. The Distributed Denial of Service (DDoS) attacks like Botnets are used to slow down the system performance. To address such problems Collaborative Network Security Management System (CNSMS) is proposed along with the association mining rule. CNSMS system ...

  15. WEB PERSONALIZATION WITH WEB USAGE MINING TECHNICS AND ASSOCIATION RULES

    Directory of Open Access Journals (Sweden)

    G. Kazeminuri

    2015-10-01

    Full Text Available As the amount of information and web development increases considerably, techniques and methods are required to allow efficient access to data and information extraction from it. Extracting useful patterns from the World Wide Web, referred to as web mining, is considered one of the main applications of data mining. A key challenge for web users is exploring websites to find relevant information efficiently and in minimum time. Discovering the hidden knowledge in how users interact with the web is considered one of the most important techniques in web usage mining. Information overload is one of the main problems of the current web, and to tackle it, web personalization systems have been presented that adapt the content and services of a website to the user's interests and browsing behavior. Today website personalization has become popular among web users and plays a leading role in speed of access and in providing the information users want. The objective of this article is to extract an index based on users' behavior and to personalize the web using web usage mining techniques and association rules. In the proposed method, weighting criteria expressing the extent of users' interest in pages are defined, and a method based on a combination of association rules and clustering by a perceptron neural network is presented for web personalization. Simulation results suggest that the proposed method improves the precision and coverage criteria with respect to the other methods compared.

  16. Study on the Customer targeting using Association Rule Mining

    Directory of Open Access Journals (Sweden)

    Surendiran.R

    2010-10-01

    Full Text Available Data mining is one of the widest areas in which much research takes place to mine desired and hidden data. There are many different approaches to finding hidden data. This paper deals with the Frequent Pattern growth algorithm, which follows the association rule concept to group the required data items. Using this method, mining time can be reduced to a great extent. This paper contains an implementation of a real-time system; the implementation is based on a survey of a group of people and their mobile connection service providers. The end result contains the set of people from a particular age group, with the support and confidence for the service provider they have chosen. Based on this, service providers can make decisions to enhance their business and attract more customers.

  17. PHARM – Association Rule Mining for Predictive Health

    Science.gov (United States)

    Cheng, Chih-Wen; Martin, Greg S.; Wu, Po-Yen; Wang, May D.

    2016-01-01

    Predictive health is a new and innovative healthcare model that focuses on maintaining health rather than treating diseases. Such a model may benefit from computer-based decision support systems, which provide more quantitative health assessment, enabling more objective advice and action plans from predictive health providers. However, data mining for predictive health is more challenging compared to that for diseases. This is a reason why there are relatively fewer predictive health decision support systems embedded with data mining. The purpose of this study is to research and develop an interactive decision support system, called PHARM, in conjunction with Emory Center for Health Discovery and Well Being (CHDWB®). PHARM adopts association rule mining to generate quantitative and objective rules for health assessment and prediction. A case study results in 12 rules that predict mental illness based on five psychological factors. This study shows the value and usability of the decision support system to prevent the development of potential illness and to prioritize advice and action plans for reducing disease risks.

  18. BINARY DECISION TREE FOR ASSOCIATION RULES MINING IN INCREMENTAL DATABASES

    Directory of Open Access Journals (Sweden)

    Amaranatha Reddy P

    2015-11-01

    Full Text Available This research paper proposes an algorithm to find association rules for incremental databases. Most transaction databases are dynamic. Consider, for example, the daily purchase transactions of supermarket customers. Customers' purchasing behaviour may change from day to day, and new products replace old products. In this scenario static data mining algorithms are of little value. If an algorithm learns continuously from day to day, we can obtain the most up-to-date knowledge, which is very helpful in today's fast-changing world. Well-known benchmark algorithms for association rule mining are Apriori and FP-Growth. However, the major drawback of Apriori and FP-Growth is that they must be rebuilt all over again once the original database is changed. Therefore, in this paper we introduce an efficient algorithm called Binary Decision Tree (BDT) to process incremental data. Processing data continuously requires a large amount of processing and storage resources. In this algorithm we scan the database only once, constructing a dynamically growing binary tree to find association rules with better performance and optimum storage. The algorithm can also be applied to static data, but our main intention is to give an optimum solution for incremental data.

  19. Association Rules in Horizontally Distributed Databases with Enhanced Secure Mining

    Directory of Open Access Journals (Sweden)

    Sonal Patil

    2014-08-01

    Full Text Available Recent developments in information technology have made possible the collection and analysis of millions of transactions containing personal data. These data include shopping habits, criminal records, medical histories and credit records, among others. A distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU, controlled by a distributed database management system (together sometimes called a distributed database system). It may be stored in multiple computers located in the same physical location or may be dispersed over a network of interconnected computers. A protocol has been proposed for secure mining of association rules in horizontally distributed databases. This protocol is more optimized than the Fast Distributed Mining (FDM) algorithm, which is an unsecured distributed version of the Apriori algorithm. The main purpose of this protocol is to remove the problem of mining generalized association rules that affects the existing system. This protocol offers enhanced privacy with respect to previous protocols. In addition, it is simpler and more efficient than other protocols in terms of communication rounds, communication cost and computational cost.

  20. ASSOCIATION RULES IN HORIZONTALLY DISTRIBUTED DATABASES WITH ENHANCED SECURE MINING

    Directory of Open Access Journals (Sweden)

    Sonal Patil

    2015-10-01

    Full Text Available Recent developments in information technology have made possible the collection and analysis of millions of transactions containing personal data. These data include shopping habits, criminal records, medical histories and credit records, among others. A distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU, controlled by a distributed database management system (together sometimes called a distributed database system). It may be stored in multiple computers located in the same physical location or may be dispersed over a network of interconnected computers. A protocol has been proposed for secure mining of association rules in horizontally distributed databases. This protocol is more optimized than the Fast Distributed Mining (FDM) algorithm, which is an unsecured distributed version of the Apriori algorithm. The main purpose of this protocol is to remove the problem of mining generalized association rules that affects the existing system. This protocol offers enhanced privacy with respect to previous protocols. In addition, it is simpler and more efficient than other protocols in terms of communication rounds, communication cost and computational cost.

  1. Association Rule Mining: A Survey

    Institute of Scientific and Technical Information of China (English)

    贾彩燕; 倪现君

    2003-01-01

    Association rule mining has been one of the most popular data mining subjects and has a wide range of applicability. In this paper, we first investigate the main approaches to the task of association rule mining and analyze the essence of the algorithms. Then we review the foundations of association rule mining based on several possible theoretical frameworks for data mining. Furthermore, we present the open problems in the field of association rule mining and identify its development trends in recent years.

  2. Microbial genotype–phenotype mapping by class association rule mining

    OpenAIRE

    Tamura, Makio; D'haeseleer, Patrik

    2008-01-01

    Motivation: Microbial phenotypes are typically due to the concerted action of multiple gene functions, yet the presence of each gene may have only a weak correlation with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) instead of pairwise relations between a single gene and the phenotype. Here, we propose an efficient class association rule mining algorithm, netCAR, in order to extract sets of COGs (clu...

  3. Secure Mining of Association Rules in Horizontally Distributed Databases

    CERN Document Server

    Tassa, Tamir

    2011-01-01

    We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton (TKDE 2004). Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. (PDIS 1996), which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms --- one that computes the union of private subsets that each of the interacting players hold, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol of Kantarcioglu and Clifton. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost.

  4. An Efficient Approach to Prune Mined Association Rules in Large Databases

    Directory of Open Access Journals (Sweden)

    D. Narmadha

    2011-01-01

    Full Text Available Association rule mining finds interesting associations and/or correlation relationships among large sets of data items. However, when the number of association rules becomes large, the rules become less interesting to the user. It is crucial to help the decision-maker with an efficient postprocessing step in order to select interesting association rules from huge volumes of discovered rules. This motivates the need for association analysis. Thus, this paper presents a novel approach to prune mined association rules in large databases. Further, an analysis of different association rule mining techniques for market basket analysis, highlighting the strengths of each technique, is also discussed. We point out potential pitfalls as well as challenging issues that need to be addressed by an association rule mining technique. We believe that the results of this approach will help decision makers make important decisions.

  5. Prediction of users webpage access behaviour using association rule mining

    Indian Academy of Sciences (India)

    R Geetharamani; P Revathy; Shomona G Jacob

    2015-12-01

    Web Usage mining is a technique used to identify the user needs from the web log. Discovering hidden patterns from the logs is an upcoming research area. Association rules play an important role in many web mining applications to detect interesting patterns. However, it generates enormous rules that cause researchers to spend ample time and expertise to discover the really interesting ones. This paper works on the server logs from the MSNBC dataset for the month of September 1999. This research aims at predicting the probable subsequent page in the usage of web pages listed in this data based on their navigating behaviour by using Apriori prefix tree (PT) algorithm. The generated rules were ranked based on the support, confidence and lift evaluation measures. The final predictions revealed that the interestingness of pages mainly depended on the support and lift measure whereas confidence assumed a uniform value among all the pages. It proved that the system guaranteed 100% confidence with the support of 1.3E−05. It revealed that the pages such as Front page, On-air, News, Sports and BBS attracted more interested subsequent users compared to Travel, MSN-News and MSN-Sports which were of less interest.
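The three ranking measures named above have standard definitions: support is the fraction of transactions containing all items of the rule, confidence is support(A ∪ C) / support(A), and lift is confidence / support(C). The sketch below computes them for a rule A → C; the function names and the page sessions are made up for illustration and are not taken from the MSNBC dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    supp_a = support(transactions, antecedent)
    supp_c = support(transactions, consequent)
    supp_both = support(transactions, set(antecedent) | set(consequent))
    confidence = supp_both / supp_a if supp_a else 0.0
    lift = confidence / supp_c if supp_c else 0.0
    return {"support": supp_both, "confidence": confidence, "lift": lift}


# Hypothetical page-visit sessions, for illustration only.
sessions = [
    ["frontpage", "news"],
    ["frontpage", "news", "sports"],
    ["frontpage", "sports"],
    ["news"],
]
metrics = rule_metrics(sessions, ["frontpage"], ["news"])
```

A lift below 1, as in this toy example, means the consequent page is visited less often after the antecedent than it is overall.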

  6. Integrated Web Recommendation Model with Improved Weighted Association Rule Mining

    Directory of Open Access Journals (Sweden)

    S.A.Sahaaya Arul Mary

    2013-04-01

    Full Text Available The World Wide Web plays a significant role in human life. It requires technological improvement to satisfy user needs. Web log data is essential for improving the performance of the web. It contains large, heterogeneous and diverse data. Analyzing the web log data is a tedious process for Web developers, Web designers, technologists and end users. In this work, a new weighted association mining algorithm is developed to identify the best association rules that are useful for web site restructuring and recommendation, reducing false visits and improving users' navigation behavior. The algorithm finds the frequent item sets from a large uncertain database. The problem with existing algorithms is that they scan the database repeatedly, which leads to a complex output set and a time-consuming process. The proposed algorithm scans the database only once at the beginning of the process, and the generated frequent item sets are stored in the database. Evaluation parameters such as support, confidence, lift and number of rules are considered to analyze the performance of the proposed algorithm and a traditional association mining algorithm. The new algorithm produced the best results, which help developers restructure their websites to meet the requirements of end users within a short time span.

  7. Optimization of Association Rule Mining through Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    RUPALI HALDULAKAR,

    2011-03-01

    Full Text Available Strong rule generation is an important area of data mining. In this paper we design a novel method for generating strong rules, in which a general Apriori algorithm is used to generate the rules, after which optimization techniques are applied. A genetic algorithm is one of the best ways to optimize the rules. For the optimization of the rule set we design a new fitness function that uses the concept of supervised learning, so that the GA is able to generate a stronger rule set.

  8. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Full text: The classification of cancer tumors based on gene expression profiles has been extensively studied. A wide variety of cancer datasets have been analyzed with various methods of gene selection and classification to identify the behavior of genes in tumors and find the relationships between them and disease outcomes. Interpretability of the model, which in this study is developed with fuzzy rules and linguistic variables, has rarely been considered. In addition, creating a fuzzy classifier with high classification performance that uses a subset of significant genes selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. At first, different subsets of genes, selected by different methods, were used to generate primary fuzzy classifiers separately; then the proposed algorithm was applied to combine the genes associated in the primary classifiers and generate a new classifier. The results show that the fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes by linguistic variables.

  9. A SURVEY ON PRIVACY PRESERVING ASSOCIATION RULE MINING

    Directory of Open Access Journals (Sweden)

    K.Sathiyapriya

    2013-03-01

    Full Text Available Businesses share data, outsourcing for specific business problems. Large companies stake a large part of their business on analysis of private data. Consulting firms often handle sensitive third party data as part of client projects. Organizations face great risks while sharing their data. Most of this sharing takes place with little secrecy. It also increases the legal responsibility of the parties involved in the process. So, it is crucial to reliably protect their data due to legal and customer concerns. In this paper, a review of the state-of-the-art methods for privacy preservation is presented. It also analyzes the techniques for privacy preserving association rule mining and points out their merits and demerits. Finally the challenges and directions for future research are discussed.

  10. A NEW ASSOCIATION RULE MINING BASED ON FREQUENT ITEM SET

    Directory of Open Access Journals (Sweden)

    Ms. Sanober Shaikh

    2011-09-01

    Full Text Available In this paper a new mining algorithm is defined based on frequent item sets. The Apriori algorithm scans the database every time it finds frequent item sets, so it is very time consuming, and at each step it generates candidate item sets. For large databases it therefore takes a lot of space to store the candidate item sets. The defined algorithm scans the database only once, at the start, and then builds an undirected item set graph. From this graph it finds the frequent item sets by considering the minimum support, and generates the association rules by considering the minimum confidence. If the database or the minimum support changes, the new algorithm finds the new frequent items by scanning the undirected item set graph. As a result, its execution efficiency is distinctly improved compared to the traditional algorithm.

  11. A Novel Method for Privacy Preserving in Association Rule Mining Based on Genetic Algorithms

    OpenAIRE

    Mohammad Naderi Dehkordi; Kambiz Badie; Ahmad Khadem Zadeh

    2009-01-01

    Extracting knowledge from large amounts of data is an important issue in data mining systems. One of the most important activities in data mining is association rule mining, and a new direction for data mining research is the privacy of mining. Today association rule mining is a hot research topic in the data mining and security areas. A lot of research has been done in this area, but most of it has focused on heuristic perturbation of the original database. Therefore the final accuracy of released datab...

  12. Fast rule-based bioactivity prediction using associative classification mining

    OpenAIRE

    Yu Pulan; Wild David J

    2012-01-01

    Abstract Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, class...

  13. SQL Based Association Rule Mining

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

    Data mining is becoming increasingly important since the size of databases grows ever larger and the need to explore hidden rules in databases becomes widely recognized. Currently database systems are dominated by relational databases, and the ability to perform data mining using standard SQL queries will definitely ease the implementation of data mining. In this paper, we introduce an association rule mining algorithm based on Apriori and its implementation using SQL. At the end of the paper, we summarize the paper.
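To give a flavour of Apriori-style counting in standard SQL, the sketch below runs two queries through Python's built-in `sqlite3` module. It is an illustration under stated assumptions, not the paper's implementation: the `sales(tid, item)` table layout, the sample rows and the `MIN_SUPPORT` threshold are all hypothetical, and each (tid, item) pair is assumed to occur at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
rows = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "butter"), (2, "milk"),
    (3, "butter"), (3, "milk"),
    (4, "bread"), (4, "butter"),
    (5, "bread"), (5, "jam"),
]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

MIN_SUPPORT = 2  # absolute support count (assumed threshold)

# Pass 1: frequent single items via GROUP BY / HAVING.
frequent_items = conn.execute(
    "SELECT item, COUNT(*) FROM sales GROUP BY item "
    "HAVING COUNT(*) >= ? ORDER BY item",
    (MIN_SUPPORT,),
).fetchall()

# Pass 2: frequent pairs via a self-join on the transaction id.
# The a.item < b.item condition emits each unordered pair once.
frequent_pairs = conn.execute(
    """
    SELECT a.item, b.item, COUNT(*) AS support
    FROM sales AS a
    JOIN sales AS b ON a.tid = b.tid AND a.item < b.item
    GROUP BY a.item, b.item
    HAVING COUNT(*) >= ?
    ORDER BY a.item, b.item
    """,
    (MIN_SUPPORT,),
).fetchall()
```

Larger itemsets follow the same pattern with one additional self-join per extra item, which is where restricting the join to the previous level's frequent itemsets starts to pay off.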

  14. A Conformity Measure using Background Knowledge for Association Rules: Application to Text Mining

    OpenAIRE

    Cherfi, Hacène; Napoli, Amedeo; Toussaint, Yannick

    2009-01-01

    A text mining process using association rules generates a very large number of rules. According to experts of the domain, most of these rules basically convey a common knowledge, i.e. rules which associate terms that experts may likely relate to each other. In order to focus on the result interpretation and discover new knowledge units, it is necessary to define criteria for classifying the extracted rules. Most of the rule classification methods are based on numerical quality measures. In th...

  15. Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps

    OpenAIRE

    An, Li; Obradovic, Zoran; Smith, Desmond; Bodenreider, Olivier; Megalooikonomou, Vasileios

    2009-01-01

    Association rules mining methods have been recently applied to gene expression data analysis to reveal relationships between genes and different conditions and features. However, not much effort has focused on detecting the relation between gene expression maps and related gene functions. Here we describe such an approach to mine association rules among gene functions in clusters of similar gene expression maps on mouse brain. The experimental results show that the detected association rules ...

  16. A Fast Distributed Algorithm for Association Rule Mining Based on Binary Coding Mapping Relation

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Association rule mining is an important issue in data mining. The paper proposes a binary-system-based method to generate candidate frequent itemsets and the corresponding support counts efficiently, which needs only operations such as "and", "or" and "xor". Applying this idea to the existing distributed association rule mining algorithm FDM, the improved algorithm BFDM is proposed. Theoretical analysis and experiments show that BFDM is effective and efficient.

  17. Action Rules Mining

    CERN Document Server

    Dardzinska, Agnieszka

    2013-01-01

    We are surrounded by data, numerical, categorical and otherwise, which must be analyzed and processed to convert it into information that instructs, answers or aids understanding and decision making. Data analysts in many disciplines, such as business, education or medicine, are frequently asked to analyze new data sets, which are often composed of numerous tables possessing different properties. They try to find completely new correlations between attributes and show new possibilities for users. Action Rules Mining discusses some data mining and knowledge discovery principles and then describes representative concepts, methods and algorithms connected with action. The author introduces the formal definition of an action rule, the notions of a simple association action rule and a representative action rule, and the cost of an association action rule, and gives a strategy for constructing simple association action rules of lowest cost. A new approach for generating action rules from datasets with numerical attributes...

  18. Research on Algorithm for Mining Negative Association Rules Based on Frequent Pattern Tree

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e. absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, classifiers that build their classification model based on association rules. Indeed, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on FP-tree is presented to discover negative association rules.
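The notion of a negated item can be made concrete with a brute-force sketch that measures the confidence of a rule A → ¬B, i.e. how often B is absent from transactions containing A. This does not use an FP-tree and is not the paper's algorithm; the function names and the baskets are hypothetical. For a single negated item the result equals 1 − confidence(A → B).

```python
def support(transactions, present, absent=()):
    """Fraction of transactions containing all `present` items and none of `absent`."""
    present, absent = set(present), set(absent)
    hits = sum(
        1 for t in transactions
        if present <= set(t) and not (absent & set(t))
    )
    return hits / len(transactions)


def negative_rule_confidence(transactions, antecedent, negated_consequent):
    """Confidence of antecedent -> NOT negated_consequent."""
    supp_a = support(transactions, antecedent)
    if supp_a == 0:
        return 0.0
    return support(transactions, antecedent, absent=negated_consequent) / supp_a


# Hypothetical market baskets, for illustration only.
baskets = [
    {"coke", "chips"},
    {"coke"},
    {"pepsi", "chips"},
    {"pepsi"},
    {"coke", "chips"},
]
coke_excludes_pepsi = negative_rule_confidence(baskets, {"coke"}, {"pepsi"})
chips_excludes_coke = negative_rule_confidence(baskets, {"chips"}, {"coke"})
```

A confidence close to 1 for coke → ¬pepsi is the kind of signal that suggests two products conflict with each other.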

  19. An Efficient Association Rule Hiding Algorithm for Privacy Preserving Data Mining

    Directory of Open Access Journals (Sweden)

    Yogendra Kumar Jain,

    2011-07-01

    Full Text Available The security of a large database that contains crucial information becomes a serious issue when data are shared over a network and must be protected against unauthorized access. Privacy preserving data mining is a new research trend concerning private data in data mining and statistical databases. Association analysis is a powerful tool for discovering relationships hidden in large databases. Association rule hiding algorithms offer strong and efficient performance in protecting confidential and crucial data. Data modification and rule hiding is one of the most important approaches to securing data. The objective of the proposed association rule hiding algorithm for privacy preserving data mining is to hide certain information so that it cannot be discovered through association rule mining algorithms. The main approach of association rule hiding algorithms is to hide some generated association rules by increasing or decreasing the support or the confidence of the rules, so that the rule items, whether on the Left Hand Side (LHS) or the Right Hand Side (RHS) of the generated rule, cannot be deduced through association rule mining algorithms. The Increase Support of Left Hand Side (ISL) algorithm decreases the confidence of a rule by increasing the support value of the LHS. It does not work on both sides of the rule; it works only by modifying the LHS. In the Decrease Support of Right Hand Side (DSR) algorithm, the confidence of the rule is decreased by decreasing the support value of the RHS. It works by modifying the RHS. We propose a new algorithm that solves the problems of both. It can increase and decrease the support of the LHS and RHS items of the rule correspondingly, so that more rules are hidden with fewer modifications. The efficiency of the proposed algorithm is compared with the ISL and DSR algorithms on real databases, on the basis of the number of rules hidden, CPU time and the number of modified entries, and it achieves better results.
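Since confidence(LHS → RHS) = support(LHS ∪ RHS) / support(LHS), both hiding strategies described above lower the ratio from different ends. The sketch below is a deliberately simplified illustration of that effect, not the ISL or DSR algorithms as published nor the proposed combined algorithm: the transaction-selection policy, the function names and the sample database are assumptions, and LHS and RHS are assumed to be disjoint.

```python
def confidence(transactions, lhs, rhs):
    """support(lhs | rhs) / support(lhs), computed with raw counts."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= set(t))
    n_both = sum(1 for t in transactions if (lhs | rhs) <= set(t))
    return n_both / n_lhs if n_lhs else 0.0


def increase_lhs_support(transactions, lhs, rhs, max_changes):
    """ISL-style: add the LHS to transactions lacking both the full LHS and the full RHS."""
    lhs, rhs = set(lhs), set(rhs)
    result, changes = [], 0
    for t in transactions:
        t = set(t)
        if changes < max_changes and not lhs <= t and not rhs <= t:
            t |= lhs  # raises support(lhs) without raising support(lhs | rhs)
            changes += 1
        result.append(t)
    return result


def decrease_rhs_support(transactions, lhs, rhs, max_changes):
    """DSR-style: remove one RHS item from transactions that support the rule."""
    lhs, rhs = set(lhs), set(rhs)
    victim = sorted(rhs)[0]  # arbitrary but deterministic choice
    result, changes = [], 0
    for t in transactions:
        t = set(t)
        if changes < max_changes and (lhs | rhs) <= t:
            t.discard(victim)  # lowers support(lhs | rhs), keeps support(lhs)
            changes += 1
        result.append(t)
    return result


# Hypothetical database, for illustration only; the sensitive rule is a -> b.
db = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}, {"c", "d"}]
before = confidence(db, {"a"}, {"b"})
after_isl = confidence(increase_lhs_support(db, {"a"}, {"b"}, 2), {"a"}, {"b"})
after_dsr = confidence(decrease_rhs_support(db, {"a"}, {"b"}, 1), {"a"}, {"b"})
```

A rule counts as hidden once its confidence (or support) falls below the miner's threshold; the number of transactions changed to get there is the modification cost the abstract refers to.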

  20. GenMiner: mining informative association rules from genomic data

    OpenAIRE

    Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

    2007-01-01

    GENMINER is a smart adaptation of closed-itemset-based association rule extraction to genomic data. It takes advantage of the novel NORDI discretization method and of the JCLOSE algorithm to efficiently generate minimal non-redundant association rules. GENMINER facilitates the integration of numerous sources of biological information such as gene expressions and annotations, and can tacitly integrate qualitative information on biological conditions (age, sex, etc.)....

  1. RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

    Science.gov (United States)

    Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-01-01

    Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules of items (or genes) evolved by association rule mining (ARM) algorithms confuses the decision maker. In this article, we propose a weighted rule-mining technique (say, RANWAR or rank-based weighted association rule-mining) to rank the rules using two novel rule-interestingness measures, viz., rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to bypass the problem. These measures basically depend on the rank of items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than the state-of-the-art association rule mining algorithms. Thus, it saves execution time. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontologies (GOs) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant to the related diseases. Finally, the top rules evolved by RANWAR that are not in Apriori are reported. PMID:25265613

  2. Using Association Rule Mining for Extracting Product Sales Patterns in Retail Store Transactions

    Directory of Open Access Journals (Sweden)

    Pramod Prasad,

    2011-05-01

    Full Text Available Computers and software play an integral part in the working of businesses and organisations. An immense amount of data is generated with the use of software. These large datasets need to be analysed for useful information that would benefit organisations, businesses and individuals by supporting decision making and providing valuable knowledge. Data mining is an approach that aids in fulfilling this requirement. Data mining is the process of applying mathematical, statistical and machine learning techniques on large quantities of data (such as a data warehouse) with the intention of uncovering hidden patterns, often previously unknown. Data mining involves three general approaches to extracting useful information from large data sets, namely, classification, clustering and association rule mining. This paper elaborates upon the use of association rule mining in extracting patterns that occur frequently within a dataset and showcases the implementation of the Apriori algorithm in mining association rules from a dataset containing sales transactions of a retail store.
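
    As a rough illustration of what an Apriori run over retail baskets involves, the following sketch performs the level-wise join, prune and count steps and then derives rules from the frequent itemsets. It is not the paper's implementation: the basket contents, the function names `apriori` and `rules`, and the thresholds are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count step: one pass over the data for the current candidates.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) triples that meet min_confidence."""
    found = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = count / frequent[lhs]
                if conf >= min_confidence:
                    found.append((lhs, itemset - lhs, conf))
    return found

# Hypothetical sales transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=3)
for lhs, rhs, conf in rules(frequent, min_confidence=0.75):
    print(sorted(lhs), "->", sorted(rhs), conf)
```

    Here `min_support` is an absolute transaction count rather than a fraction; either convention works as long as it is applied consistently.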

  3. WEB-BASED DATA MINING TOOLS : PERFORMING FEEDBACK ANALYSIS AND ASSOCIATION RULE MINING

    Directory of Open Access Journals (Sweden)

    Pratiyush Guleria

    2015-11-01

    Full Text Available This paper aims to explain web-enabled tools for educational data mining. The proposed web-based tools, developed using the Asp.Net framework and php, can help universities or institutions provide students with elective courses as well as improve academic activities based on feedback collected from students. In the Asp.Net tool, association rule mining using the Apriori algorithm is used, whereas in the php-based Feedback Analytical Tool, feedback related to faculty and institutional infrastructure is collected from students, and based on that feedback the tool shows the performance of faculty and institution. Using that data, it helps management to improve in-house training skills and gain knowledge about educational trends which are to be followed by faculty to improve the effectiveness of the course and teaching skills.

  4. Reduction of Negative and Positive Association Rule Mining and Maintain Superiority of Rule Using Modified Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Nikhil Jain,Vishal Sharma,Mahesh Malviya

    2012-12-01

    Full Text Available Association rule mining plays an important role in market data analysis and also in the medical diagnosis of correlated problems. Various techniques are used to generate association rules, such as the Apriori algorithm, FP-growth and tree-based algorithms. Some algorithms perform very well but generate negative association rules and also suffer from the superiority measure problem. In this paper we propose a multi-objective association rule mining method based on a genetic algorithm and the Euclidean distance formula. In this method we find the near distance of the rule set using the Euclidean distance formula and generate two classes, a higher class and a lower class. The validity of each class is checked by a distance weight vector. Basically, the distance weight vector maintains a threshold value for rule itemsets. Throughout the process we use a genetic algorithm to optimize the rule set. Here we set the population size to 1000, and the selection process is validated by the distance weight vector. Our proposed algorithm, distance weight optimization of association rule mining with a genetic algorithm, is compared with multi-objective association rule optimization using a genetic algorithm. Our proposed algorithm generates a better rule set than the MORA method.

  5. arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

    Directory of Open Access Journals (Sweden)

    Michael Hahsler

    2005-09-01

    Full Text Available Mining frequent itemsets and association rules is a popular and well researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules.

  6. PARM--an efficient algorithm to mine association rules from spatial data.

    Science.gov (United States)

    Ding, Qin; Ding, Qiang; Perrizo, William

    2008-12-01

    Association rule mining, originally proposed for market basket data, has potential applications in many areas. Spatial data, such as remote sensed imagery (RSI) data, is one of the promising application areas. Extracting interesting patterns and rules from spatial data sets, composed of images and associated ground data, can be of importance in precision agriculture, resource discovery, and other areas. However, in most cases, the sizes of the spatial data sets are too large to be mined in a reasonable amount of time using existing algorithms. In this paper, we propose an efficient approach to derive association rules from spatial data using Peano Count Tree (P-tree) structure. P-tree structure provides a lossless and compressed representation of spatial data. Based on P-trees, an efficient association rule mining algorithm PARM with fast support calculation and significant pruning techniques is introduced to improve the efficiency of the rule mining process. The P-tree based Association Rule Mining (PARM) algorithm is implemented and compared with FP-growth and Apriori algorithms. Experimental results showed that our algorithm is superior for association rule mining on RSI spatial data. PMID:19022723

  7. Interestingness Measure for Mining Spatial Gene Expression Data using Association Rule

    CERN Document Server

    Anandhavalli, M; Gauthaman, K

    2010-01-01

    The search for interesting association rules is an important topic in knowledge discovery in spatial gene expression databases. The set of admissible rules for the selected support and confidence thresholds can easily be extracted by algorithms based on support and confidence, such as Apriori. However, they may produce a large number of rules, many of which are uninteresting. The challenge in association rule mining (ARM) essentially becomes one of determining which rules are the most interesting. Association rule interestingness measures are used to help select and rank association rule patterns. Besides support and confidence, there are other interestingness measures, which include generality, reliability, peculiarity, novelty, surprisingness, utility, and applicability. In this paper, the application of the interestingness measures entropy and variance for association pattern discovery from spatial gene expression data has been studied. In this study the fast mining algorithm has been used which produce candidat...

  8. An algorithm of spatial association rules mining used in mobile computing

    Science.gov (United States)

    Tu, Chengsheng

    2011-10-01

    In order to mine spatial association rules quickly and improve the efficiency of mobile intelligent systems, this paper proposes an alternative-search algorithm for mining spatial association rules. The algorithm first uses spatial buffer analysis to extract spatial predicate values, then uses the spatial predicate values of every target location to form a spatial transaction and turns it into an integer by binary coding, and finally uses an alternative-search iteration method to extract spatial association rules. That is, it not only generates candidate frequent itemsets by iteratively taking the (L-1)-subsets of L-non frequent itemsets, it also generates candidate frequent itemsets by iteratively taking the (K+1)-supersets of K-frequent itemsets. The results of a simulation experiment indicate that the algorithm is faster and more efficient than present mining algorithms when mining spatial association rules in mobile computing.
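
    The binary coding step and the two candidate-generation directions can be pictured with integer bitmasks. The sketch below is an assumption-laden illustration rather than the paper's algorithm: the predicate names, the helper functions and the three transactions are all hypothetical, and only the encoding and the subset/superset enumeration are shown, not the full alternative search.

```python
def encode(transaction, bit_of):
    """Pack a set of spatial predicates into one integer, one bit per predicate."""
    code = 0
    for predicate in transaction:
        code |= 1 << bit_of[predicate]
    return code

def support_count(codes, mask):
    """Number of encoded transactions containing every predicate in `mask`."""
    return sum((c & mask) == mask for c in codes)

def subsets_one_smaller(mask):
    """Itemsets obtained by clearing exactly one set bit of `mask`."""
    return [mask & ~(1 << i) for i in range(mask.bit_length()) if mask >> i & 1]

def supersets_one_larger(mask, n_bits):
    """Itemsets obtained by setting exactly one unset bit of `mask`."""
    return [mask | (1 << i) for i in range(n_bits) if not mask >> i & 1]

# Hypothetical predicate vocabulary and spatial transactions.
predicates = ["close_to_road", "inside_park", "near_river"]
bit_of = {p: i for i, p in enumerate(predicates)}
transactions = [
    {"close_to_road", "inside_park"},
    {"close_to_road", "near_river"},
    {"close_to_road", "inside_park", "near_river"},
]
codes = [encode(t, bit_of) for t in transactions]

print(codes)
print(support_count(codes, 0b011))
print(subsets_one_smaller(0b111), supersets_one_larger(0b001, len(predicates)))
```

    With this encoding, testing whether a transaction contains an itemset is a single bitwise AND, which is one plausible reason an integer representation helps on resource-limited mobile devices.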

  9. A Frequent Closed Itemsets Lattice-based Approach for Mining Minimal Non-Redundant Association Rules

    CERN Document Server

    Vo, Bay

    2011-01-01

    Many algorithms have been developed to improve the time of mining frequent itemsets (FI) or frequent closed itemsets (FCI). However, algorithms that deal with the time of generating association rules have not been studied in depth. In reality, for a database containing many FI/FCI (from tens of thousands up to millions), the time of generating association rules is much larger than that of mining FI/FCI. Therefore, this paper presents an application of the frequent closed itemsets lattice (FCIL) for mining minimal non-redundant association rules (MNAR) to greatly reduce the time for generating rules. Firstly, we use CHARM-L for building the FCIL. After that, based on the FCIL, an algorithm for fast generation of MNAR is proposed. Experimental results show that the proposed algorithm is much faster than the frequent itemsets lattice-based algorithm in mining time.
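
    For readers unfamiliar with the term, a frequent itemset is closed when no proper superset has the same support. The brute-force filter below illustrates only that definition; it is not CHARM-L or the lattice-based method of the paper, and the function name and toy support counts are assumptions. It presumes that `frequent` holds every frequent itemset with its support count.

```python
def closed_itemsets(frequent):
    """Keep the itemsets that have no proper superset with an equal support count."""
    return {
        s: n for s, n in frequent.items()
        if not any(s < t and n == m for t, m in frequent.items())
    }

# Hypothetical support counts for all frequent itemsets of a small database.
frequent = {
    frozenset({"a"}): 3,
    frozenset({"b"}): 4,
    frozenset({"a", "b"}): 3,
}
closed = closed_itemsets(frequent)
print(closed)
```

    Here {a} is dropped because {a, b} occurs in exactly the same number of transactions, so rules built from the closed sets alone carry the same support information with fewer itemsets.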

  10. Using an improved association rules mining optimization algorithm in web-based mobile-learning system

    Science.gov (United States)

    Huang, Yin; Chen, Jianhua; Xiong, Shaojun

    2009-07-01

    Mobile-Learning (M-learning) gives many learners the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rule explosion is a serious problem which causes great concern, as conventional mining algorithms often produce too many rules for decision makers to digest. Since a Web-based Mobile-Learning System collects vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners and so on. Therefore, this paper focuses on a new data-mining algorithm that combines the advantages of the genetic algorithm and the simulated annealing algorithm, called ARGSA (Association rules based on an improved Genetic Simulated Annealing Algorithm), to mine the association rules. This paper first takes advantage of the Parallel Genetic Algorithm and Simulated Algorithm designed specifically for discovering association rules. Moreover, analysis and experiments are also carried out to show that the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.

  11. Association Rule Mining for Both Frequent and Infrequent Items Using Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    MIR MD. JAHANGIR KABIR

    2014-07-01

    Full Text Available In data mining research, generating frequent items from large databases is one of the important issues and the key factor for implementing association rule mining tasks. Mining infrequent items, such as relationships among rare but expensive products, is another demanding issue which has been shown in some recent studies. Therefore this study considers user-assigned threshold values as a constraint which helps users mine those rules which are more interesting for them. In addition, in the real world users may prefer to know relationships among frequent items along with infrequent ones. The particle swarm optimization algorithm has been an important heuristic technique in recent years, and this study uses this technique to mine association rules effectively. If this technique considers user-defined threshold values, interesting association rules can be generated more efficiently. Therefore this study proposes a novel approach which uses the particle swarm optimization algorithm to mine association rules from databases. Our implementation of the search strategy includes a bitmap representation of nodes in a lexicographic tree, and from the superset-subset relationship of the nodes it classifies frequent itemsets along with infrequent itemsets. In addition, this approach avoids the extra calculation overhead of generating frequent pattern trees and of handling the large memory that stores the support values of candidate itemsets. Our experimental results show that this approach efficiently mines association rules. It accesses the database to calculate a support value for fewer nodes to find frequent itemsets, and from these it generates association rules, which dramatically reduces search time. The main aim of this proposed algorithm is to show how a heuristic method works on real databases to find all the interesting association rules in an efficient way.

  12. Causal association rule mining methods based on fuzzy state description

    Institute of Scientific and Technical Information of China (English)

    Liang Kaijian; Liang Quan; Yang Bingru

    2006-01-01

    Aiming at research that uses more new knowledge to develop a knowledge system with dynamic accordance, and against the background of using the fuzzy language field and fuzzy language value structure as the description framework, a generalized cell automaton that can synthetically process fuzzy indeterminacy and random indeterminacy, together with a generalized inductive logic causal model, is brought forward. On this basis, a new method that can discover causal association rules is provided. According to the causal information of the standard sample space and the common sample space, and through constructing its state (abnormality) relation matrix, causal association rules can be obtained by using an inductive reasoning mechanism. An estimate of the algorithm's complexity is given, and its validity is demonstrated through a case study.

  13. Rule pruning and prediction methods for associative classification approach in data mining

    OpenAIRE

    Abu Mansour, Hussein Y

    2012-01-01

    Recent studies in data mining revealed that the Associative Classification (AC) data mining approach builds classifiers that are competitive in accuracy when compared to classic classification approaches, including decision tree and rule-based approaches. Nevertheless, AC algorithms suffer from a number of known defects, such as the generation of a large number of rules, which makes it hard for the end-user to maintain and understand the outcome, and the possible over-fitting issue caused by the confi...

  14. Text Mining Approaches To Extract Interesting Association Rules from Text Documents

    OpenAIRE

    Vishwadeepak Singh Baghela; S. P. Tripathi

    2012-01-01

    A handful of text data mining approaches are available to extract potential information and associations from large amounts of text data. The term data mining is used for methods that analyze data with the objective of finding rules and patterns describing the characteristic properties of the data. The mined information is typically represented as a model of the semantic structure of the dataset, where the model may be used on new data for prediction or classification. In general, data mi...

  15. A Novel Approach For Discovery Multi Level Fuzzy Association Rule Mining

    CERN Document Server

    Gautam, Pratima

    2010-01-01

    Finding multilevel association rules in transaction databases is widely used in data mining. In this paper, we present a model for mining multilevel association rules which satisfies a different minimum support at each level. We have employed fuzzy set concepts, a multi-level taxonomy and different minimum supports to find fuzzy multilevel association rules in a given transaction data set. The Apriori property is used in the model to prune the itemsets. The proposed model adopts a top-down, progressively deepening approach to derive large itemsets. This approach incorporates fuzzy boundaries instead of sharp boundary intervals. An example is also given to demonstrate that the proposed mining algorithm can derive the multiple-level association rules under different supports in a simple and effective manner.

  16. An algorithm about spatial association rule mining based on cell pattern

    Science.gov (United States)

    Chen, Jiangping; Li, Pingxiang; Fei, Huang; Wang, Rong

    2006-10-01

    Spatial association rules are among the most important kinds of knowledge produced by spatial data mining. They emphasize confirming the relations among data in different fields and try to find the dependence of data across multiple fields. In GIS, the spatial database is often separated into several layers or tables according to the type of spatial object, such as a road layer, building layer, plant layer, etc. In a relational database, the data are often separated into several tables which are associated by primary and foreign keys according to normal form theory. Consequently, the spatial data are stored in different layers and tables. It is necessary and meaningful to mine knowledge and rules across multiple layers and multiple tables, and in some applications mining spatial association rules across multiple layers is inevitable. One problem is that the number of rules is very large. We therefore point to a new way, using the cell pattern of the rules the user is interested in, to reduce and simplify the operation. In this paper the concept of the multi-layer spatial association rule is put forward. Then an algorithm for mining multi-layer spatial association rules, based on cell patterns and spatial concept relations, is presented; it is called AP-MLSAM in the paper. Last, an example in GIS is given. In AP-MLSAM, first, the patterns and rules that the user is interested in are confirmed. Second, the large itemsets that accord with the cell pattern are counted in each data layer. Last, the spatial association rules are obtained from the itemsets counted in the second step. The experiment showed that AP-MLSAM is effective: it improved efficiency by reducing the time needed to find the large itemsets. Mining multi-layer spatial association rules is a significant research field. There are many applications based on multi-layer spatial association analysis, for example: traffic flux analysis in cities, weather pattern analysis, trend analysis for

  17. Text Mining Approaches To Extract Interesting Association Rules from Text Documents

    Directory of Open Access Journals (Sweden)

    Vishwadeepak Singh Baghela

    2012-05-01

    Full Text Available A handful of text data mining approaches are available to extract potential information and associations from large amounts of text data. The term data mining is used for methods that analyze data with the objective of finding rules and patterns describing the characteristic properties of the data. The mined information is typically represented as a model of the semantic structure of the dataset, where the model may be used on new data for prediction or classification. In general, data mining deals with structured data (for example relational databases), whereas text presents special characteristics and is unstructured. Unstructured data is totally different from databases, where mining techniques are usually applied and structured data is managed. Text mining can work with unstructured or semi-structured data sets. A brief review of some recent research related to mining associations from text documents is presented in this paper.

  18. A Novel Similarity Assessment for Remote Sensing Images via Fast Association Rule Mining

    Science.gov (United States)

    Liu, Jun; Chen, Kai; Liu, Ping; Qian, Jing; Chen, Huijuan

    2016-06-01

    Similarity assessment is fundamentally important to various remote sensing applications such as image classification, image retrieval and so on. The objective of similarity assessment is to automatically distinguish differences between images and identify the contents of an image. Unlike the existing feature-based or object-based methods, we are more concerned with the deep-level pattern of image content. Association rule mining is capable of finding the potential patterns of an image; hence, in this paper, a fast association rule mining algorithm is proposed and similarity is represented by rules. More specifically, the proposed approach consists of the following steps: firstly, the gray level of the image is compressed using linear segmentation to avoid interference from details and reduce the amount of computation; then the compressed gray values between pixels are collected to generate the transaction sets, which are transformed into the proposed multi-dimension data cube structure; the association rules are then mined quickly based on the multi-dimension data cube; finally the mined rules are represented as a vector and similarity assessment is achieved by vector comparison using a first order approximation of the Kullback-Leibler divergence. Experimental results indicate that the proposed fast association rule mining algorithm is more effective than the widely used Apriori method. Remote sensing image retrieval experiments using various images, for example QuickBird and WorldView-2, based on the existing and proposed similarity assessments show that the proposed method can provide higher retrieval precision.

  19. Inferring Intra-Community Microbial Interaction Patterns from Metagenomic Datasets Using Associative Rule Mining Techniques.

    Science.gov (United States)

    Tandon, Disha; Haque, Mohammed Monzoorul; Mande, Sharmila S

    2016-01-01

    The nature of inter-microbial metabolic interactions defines the stability of microbial communities residing in any ecological niche. Deciphering these interaction patterns is crucial for understanding the mode/mechanism(s) through which an individual microbial community transitions from one state to another (e.g. from a healthy to a diseased state). Statistical correlation techniques have been traditionally employed for mining microbial interaction patterns from taxonomic abundance data corresponding to a given microbial community. In spite of their efficiency, these correlation techniques can capture only 'pair-wise interactions'. Moreover, their emphasis on statistical significance can potentially result in missing out on several interactions that are relevant from a biological standpoint. This study explores the applicability of one of the earliest association rule mining algorithms, i.e. the 'Apriori algorithm', for deriving 'microbial association rules' from the taxonomic profile of a given microbial community. The classical Apriori approach derives association rules by analysing patterns of co-occurrence/co-exclusion between various '(subsets of) features/items' across various samples. Using real-world microbiome data, the efficiency/utility of this rule mining approach in deciphering multiple (biologically meaningful) association patterns between 'subsets/subgroups' of microbes (constituting microbiome samples) is demonstrated. As an example, association rules derived from publicly available gut microbiome datasets indicate an association between a group of microbes (Faecalibacterium, Dorea, and Blautia) that are known to have mutualistic metabolic associations among themselves. Application of the rule mining approach on gut microbiomes (sourced from the Human Microbiome Project) further indicated similar microbial association patterns in gut microbiomes irrespective of the gender of the subjects. A Linux implementation of the Association Rule Mining (ARM

  20. Inferring Intra-Community Microbial Interaction Patterns from Metagenomic Datasets Using Associative Rule Mining Techniques

    Science.gov (United States)

    Mande, Sharmila S.

    2016-01-01

    The nature of inter-microbial metabolic interactions defines the stability of microbial communities residing in any ecological niche. Deciphering these interaction patterns is crucial for understanding the mode/mechanism(s) through which an individual microbial community transitions from one state to another (e.g. from a healthy to a diseased state). Statistical correlation techniques have been traditionally employed for mining microbial interaction patterns from taxonomic abundance data corresponding to a given microbial community. In spite of their efficiency, these correlation techniques can capture only 'pair-wise interactions'. Moreover, their emphasis on statistical significance can potentially result in missing out on several interactions that are relevant from a biological standpoint. This study explores the applicability of one of the earliest association rule mining algorithms, i.e. the 'Apriori algorithm', for deriving 'microbial association rules' from the taxonomic profile of a given microbial community. The classical Apriori approach derives association rules by analysing patterns of co-occurrence/co-exclusion between various '(subsets of) features/items' across various samples. Using real-world microbiome data, the efficiency/utility of this rule mining approach in deciphering multiple (biologically meaningful) association patterns between 'subsets/subgroups' of microbes (constituting microbiome samples) is demonstrated. As an example, association rules derived from publicly available gut microbiome datasets indicate an association between a group of microbes (Faecalibacterium, Dorea, and Blautia) that are known to have mutualistic metabolic associations among themselves. Application of the rule mining approach on gut microbiomes (sourced from the Human Microbiome Project) further indicated similar microbial association patterns in gut microbiomes irrespective of the gender of the subjects. A Linux implementation of the Association Rule Mining (ARM

  1. A Study on Post mining of Association Rules Targeting User Interest

    Directory of Open Access Journals (Sweden)

    P. Sarala; S. Jayaprada

    2012-10-01

    Full Text Available Association rule mining means discovering interesting patterns within large databases. Association rules are used in many application areas such as market basket analysis, web log analysis and protein substructures. Several post-processing methods were developed to reduce the number of rules using non-redundant rules or techniques such as pruning, summarizing, grouping or visualization based on statistical information in the database. Even so, the problem of identifying interesting rules remains. Methods such as the rule deductive method, Stream Mill Miner (SMM), a DSMS (Data Stream Management System), the medoid clustering technique (PAM: Partitioning Around Medoids), and constraint-based multi-level association rules with ontology support were developed but are not effective. The number of rules generated by Apriori and FP-growth depends on statistical measures such as support and confidence and may not suit the requirements of the user. Methods that use a ranking algorithm and IRF (Item Relatedness Filter) have the drawback of using filters during the pruning stage. The paper studies methods that were proposed for post-processing of association rules and proposes a new method for extracting association rules based on user interest using the MIRO (Mining Interest Rules Using Ontologies) framework, which uses correlation measures combined with a domain ontology and succinct constraints.

  2. DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR DISCOVERING ASSOCIATION RULES

    Directory of Open Access Journals (Sweden)

    Johannes K. Chiang

    2012-12-01

    Full Text Available Data mining is one of the most significant tools for discovering association patterns that are useful for many knowledge domains. Yet, there are some drawbacks in existing mining techniques. Three main weaknesses of current data-mining techniques are: (1) the entire database must be re-scanned whenever new attributes are added; (2) an association rule may be true at a certain granularity but fail at a smaller one, and vice versa; (3) current methods can only be used to find either frequent rules or infrequent rules, but not both at the same time. This research proposes a novel data schema and an algorithm that solve the above weaknesses while improving the efficiency and effectiveness of data mining strategies. Crucial mechanisms in each step will be clarified in this paper. Finally, this paper presents experimental results regarding efficiency, scalability, information loss, etc. of the proposed approach to prove its advantages.

  3. Hybrid Medical Image Classification Using Association Rule Mining with Decision Tree Algorithm

    CERN Document Server

    Rajendran, P

    2010-01-01

    The main focus of image mining in the proposed method is the classification of brain tumors in CT scan brain images. The major steps involved in the system are: pre-processing, feature extraction, association rule mining and hybrid classification. The pre-processing step is done using the median filtering process, and edge features are extracted using the Canny edge detection technique. Two image mining approaches are combined in a hybrid manner in this paper. The frequent patterns from the CT scan images are generated by the frequent pattern tree (FP-Tree) algorithm, which mines the association rules. The decision tree method is used to classify the medical images for diagnosis. This system makes the classification process more accurate. The hybrid method is more efficient than traditional image mining methods. The experimental result on a prediagnosed database of brain images showed 97% sensitivity and 95% accuracy respectively. The ph...

  4. Object-oriented spatial-temporal association rules mining on ocean remote sensing imagery

    International Nuclear Information System (INIS)

    Using long-term marine remote sensing imagery, we develop an object-oriented spatial-temporal association rules mining framework to explore association rules among marine environmental elements. Within the framework, two key issues are addressed: how to deal effectively with the related lattices, and how to reduce the related dimensions. To deal with the first key issue, this paper develops an object-oriented method for abstracting marine sensitive objects from raster pixels and for representing them with a quadruple. To deal with the second key issue, by embedding mutual information theory, we construct the direct association pattern tree to reduce the related elements as a first step, and then the Apriori algorithm is used to discover the spatio-temporal association rules. Finally, the Pacific Ocean is taken as the research area and multi- marine remote sensing imagery from the last three decades is used as a case study. The results show that object-oriented spatio-temporal association rules mining can acquire the associated relationships not only among marine environmental elements in the same region, but also among different regions. In addition, the information from association rules mining is much more expressive and informative in space and time than traditional spatio-temporal analysis.

  5. Gain ratio based fuzzy weighted association rule mining classifier for medical diagnostic interface

    Indian Academy of Sciences (India)

    N S Nithya; K Duraiswamy

    2014-02-01

    The health care environment still needs knowledge-based discovery for handling its wealth of data. Extraction of the potential causes of diseases is the most important factor in medical data mining. Fuzzy association rule mining performs better than traditional classifiers, but it suffers from exponential growth of the rules produced. In the past, we proposed an information gain based fuzzy association rule mining algorithm for extracting both association rules and membership functions of medical data to reduce the rules. It used a ranking based weight value to identify the potential attribute. When we take a large number of distinct values, the computation of the information gain value is not feasible. In this paper, an enhanced approach, called gain ratio based fuzzy weighted association rule mining, is thus proposed for distinct diseases and also increase the learning time of the previous one. Experimental results show that there is a marginal improvement in the attribute selection process and also an improvement in classifier accuracy. The system has been implemented on the Java platform and verified using benchmark data from the UCI machine learning repository.

  6. Stellar spectra association rule mining method based on the weighted frequent pattern tree

    International Nuclear Information System (INIS)

    Effective extraction of data association rules can provide a reliable basis for classification of stellar spectra. The concepts of stellar spectrum weighted itemsets and stellar spectrum weighted association rules are introduced, and the weight of a single property in the stellar spectrum is determined by information entropy. On that basis, a method is presented to mine the association rules of a stellar spectrum based on the weighted frequent pattern tree. Important properties of the spectral line are highlighted using this method. At the same time, the waveform of the whole spectrum is taken into account. The experimental results show that the data association rules of a stellar spectrum mined with this method are consistent with the main features of stellar spectral types. (research papers)

  7. Knowledge Discovery from Students’ Result Repository: Association Rule Mining Approach

    Directory of Open Access Journals (Sweden)

    Oladipupo O.O. & Oyelade O.J

    2010-06-01

    Full Text Available Over the years, several statistical tools have been used to analyze students’ performance from different points of view. This paper presents data mining in the education environment that identifies students’ failure patterns using the association rule mining technique. The identified patterns are analysed to offer helpful and constructive recommendations to the academic planners in higher institutions of learning to enhance their decision making process. This will also aid in the curriculum structure and modification in order to improve students’ academic performance and trim down the failure rate. The software for mining student failed courses was developed and the analytical process was described.

  8. An Efficient Algorithm for Mining Multilevel Association Rule Based on Pincer Search

    Directory of Open Access Journals (Sweden)

    Pratima Gautam

    2012-07-01

    Full Text Available Discovering frequent itemsets is a key difficulty in significant data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. The problem of developing models and algorithms for multilevel association mining poses new challenges for mathematics and computer science. In this paper, we present a model of mining multilevel association rules which satisfies a different minimum support at each level. We employ Pincer Search concepts, a multilevel taxonomy and different minimum supports to find multilevel association rules in a given transaction data set. This search is used only for maintaining and updating a new data structure. It is used to prune early the candidates that would normally be encountered in the top-down search. A main characteristic of the algorithm is that it does not require explicit examination of every frequent itemset. An example is also given to demonstrate that the proposed mining algorithm can derive the multiple-level association rules under different supports in a simple and effective manner.

  9. A mining method for tracking changes in temporal association rules from an encoded database

    Directory of Open Access Journals (Sweden)

    Chelliah Balasubramanian

    2009-07-01

    Full Text Available Mining of association rules has become vital in organizations for decision making. The principle of data mining is that it is better to use complicated primitive patterns and a simple logical combination than simple primitive patterns and a complex logical form. This paper gives an overview of temporal database encoding and association rules mining. It proposes an innovative data mining approach that reduces the size of the main database by an encoding method, which in turn reduces the memory required. The use of the anti-Apriori algorithm reduces the number of scans over the database. The Apriori family of algorithms is applied to the encoded temporal database and their performances are compared. The paper also focuses on an important method for tracking association rules that change with time. This method involves an initial decomposition of the problem. The changing association rules are then tracked by dividing the time into smaller intervals and observing the changes in the itemsets obtained in each such interval. The results obtained are lower computational complexity, time and space, with effective identification of changing association rules resulting in good decision making. This helps in formalizing the database metrics in a better way.
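
    The interval-based tracking described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' encoded-database method: the function names, the integer timestamps, the restriction to item pairs and the sample log are all assumptions made for illustration. Transactions are grouped into fixed-length intervals, frequent pairs are found per interval, and consecutive intervals are compared.

```python
from collections import Counter, defaultdict
from itertools import combinations

def frequent_pairs_by_interval(timed_transactions, interval_length, min_support):
    """Map each time interval to the set of item pairs frequent within it.

    timed_transactions: iterable of (integer timestamp, set of items).
    min_support: minimum fraction of the interval's transactions (assumed measure).
    """
    buckets = defaultdict(list)
    for timestamp, items in timed_transactions:
        buckets[timestamp // interval_length].append(frozenset(items))
    result = {}
    for interval, transactions in sorted(buckets.items()):
        counts = Counter()
        for t in transactions:
            # Count every unordered pair of items in the transaction once.
            for pair in combinations(sorted(t), 2):
                counts[pair] += 1
        n = len(transactions)
        result[interval] = {p for p, c in counts.items() if c / n >= min_support}
    return result

def changes(result):
    """List (interval, pairs that appeared, pairs that disappeared) between neighbours."""
    intervals = sorted(result)
    report = []
    for prev, cur in zip(intervals, intervals[1:]):
        report.append((cur, result[cur] - result[prev], result[prev] - result[cur]))
    return report

# Hypothetical log: timestamps 0-9 fall in interval 0, 10-19 in interval 1.
log = [
    (0, {"a", "b"}), (1, {"a", "b", "c"}), (2, {"a", "c"}),
    (10, {"b", "c"}), (11, {"b", "c"}), (12, {"a", "b"}),
]
per_interval = frequent_pairs_by_interval(log, interval_length=10, min_support=0.6)
report = changes(per_interval)
```

    Comparing only neighbouring intervals keeps the sketch short; a longer history or full rule generation per interval would follow the same pattern.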

  10. An Overview of Secure Mining of Association Rules in Horizontally Distributed Databases

    OpenAIRE

    Ms. Sonal Patil; Harshad S. Patil

    2015-01-01

    This paper proposes a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton. This protocol is based on the Fast Distributed Mining (FDM) algorithm, which is an unsecured distributed version of the Apriori algorithm. The main ingredients in this protocol are two novel secure multi-party algorithms: 1. one that computes the union of private subsets that each of the interacting players holds, and 2. T...

  11. Image Mining for Mammogram Classification by Association Rule Using Statistical and GLCM features

    Directory of Open Access Journals (Sweden)

    Aswini Kumar Mohanty

    2011-09-01

    Full Text Available Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. It is an extension of data mining to the image domain. The main objective of this paper is to apply image mining to breast mammograms to classify and detect cancerous tissue. Mammogram images can be classified into normal, benign and malignant classes, and the feasibility of the data mining approach is explored. A new association rule algorithm is proposed in this paper. Experimental results show that this new method can quickly discover frequent item sets and effectively mine potential association rules. A total of 26 features, including histogram intensity features and GLCM features, are extracted from mammogram images. A new approach to feature selection is proposed which removes approximately 60% of the features, and association rules using image content are used for classification. Most interestingly, the oscillating search algorithm used for feature selection provides the best features, and it has not previously been applied to GLCM feature selection from mammograms. Experiments were carried out on a data set of 300 images of different types taken from MIAS, with the aim of improving the accuracy by generating a minimum number of rules to cover more patterns. The accuracy obtained by this method is approximately 97.7%, which is highly encouraging.

  12. Spatio-Temporal Rule Mining

    DEFF Research Database (Denmark)

    Gidofalvi, Gyozo; Pedersen, Torben Bach

    2005-01-01

    Recent advances in communication and information technology, such as the increasing accuracy of GPS technology and the miniaturization of wireless communication devices pave the road for Location-Based Services (LBS). To achieve high quality for such services, spatio-temporal data mining techniques are needed. In this paper, we describe experiences with spatio-temporal rule mining in a Danish data mining company. First, a number of real world spatio-temporal data sets are described, leading to a taxonomy of spatio-temporal data. Second, the paper describes a general methodology that transforms the spatio-temporal rule mining task to the traditional market basket analysis task and applies it to the described data sets, enabling traditional association rule mining methods to discover spatio-temporal rules for LBS. Finally, unique issues in spatio-temporal rule mining are identified and...

  13. A Genetic Algorithm Based Multilevel Association Rules Mining for Big Datasets

    Directory of Open Access Journals (Sweden)

    Yang Xu

    2014-01-01

    Full Text Available Multilevel association rules mining is an important domain for discovering interesting relations between data elements with multiple levels of abstraction. Most of the existing algorithms for this issue are based on exhaustive search methods such as Apriori and FP-growth. However, when they are applied in big data applications, those methods suffer from extreme computational cost in searching association rules. To expedite multilevel association rules searching and avoid excessive computation, in this paper we propose a novel genetic-based method with three key innovations. First, we use the category tree to describe the multilevel application data sets as the domain knowledge. Then, we put forward a special tree encoding schema based on the category tree to build the heuristic multilevel association mining algorithm. As the last part of our design, we propose a genetic algorithm based on the tree encoding schema that greatly reduces the association rule search space. The method is especially useful in mining multilevel association rules in big data related applications. We test the proposed method with some big datasets, and the experimental results demonstrate the effectiveness and efficiency of the proposed method in processing big data. Moreover, our results also show that the algorithm converges quickly with a limited termination threshold.

  14. Identification of the Patterns Behavior Consumptions by Using Chosen Tools of Data Mining - Association Rules

    Directory of Open Access Journals (Sweden)

    R. Benda Prokeinová

    2014-09-01

    Full Text Available Research and development on the sustainable environment, a research goal of many countries and food producers, now has a long tradition. The aim of this paper is to identify patterns of consumption behaviour by using association rules, because it is important to know the segmentation differences between consumers and their opinions on current sustainability tendencies. Sustainability will continue to be discussed in Slovakia, primarily because of its impacts and its influence on consumers' buying of products that are safe for the environment and for nature. We emphasize the importance of sustainability in consumer behaviour and focus in detail on segmentation differences between respondents. We addressed a sample of 318 respondents. The article aims to identify sustainable consumer behaviour by using a chosen data mining tool, association rules. The area of knowledge-based systems widely overlaps with the techniques of data mining. Data mining is in fact devoted to the process of acquiring knowledge from large amounts of data. Its techniques and approaches are useful only when more focused external systems as well as more general systems to work with knowledge. One of the challenges of knowledge-based systems is to derive new knowledge on the basis of known facts and knowledge. Methods using association rules in a sense fulfil this function. Association rules as a data mining technique are useful in various applications such as shopping cart analysis, discovering hidden dependencies between entries, or recommendation. After an introduction and an explanation of the principle of sustainability in consumption and of association rules, a description of the algorithm for obtaining rules from transaction data follows. We then present a practical application to data obtained by a questionnaire survey. Calculations are performed in the free data mining software Tanagra.

  15. TDSGenerator-A Tool for generating synthetic Transactional Datasets for Association Rules Mining

    Directory of Open Access Journals (Sweden)

    G S Bhamra

    2011-03-01

    Full Text Available Data Mining (DM) is the process of automated extraction of interesting data patterns representing knowledge from large data sets. Frequent itemsets are the item sets that appear in a data set frequently. Finding such frequent itemsets plays an essential role in mining associations, correlations, and many other interesting relationships among itemsets in transactional and relational databases. In this paper we present a tool called Transactional Dataset Generator (TDSGenerator v1.0) for generating a Binary Dataset as well as the Transactional Dataset corresponding to the Binary Dataset. Synthetic datasets generated by this tool will be used to find the list of frequent itemsets and thereafter the strong Association Rules among those itemsets. This tool can also be used as a demonstrator for experimenting with and explaining the concepts of Association Rules Mining (ARM).
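
    The relationship between a binary dataset and its transactional counterpart can be shown with a short Python sketch. This is not the TDSGenerator implementation: the function names, the uniform `density` parameter and the fixed random seed are assumptions chosen only to make the example reproducible.

```python
import random

def generate_binary_dataset(n_transactions, n_items, density, seed=0):
    """Return a list of 0/1 rows; each cell is 1 with probability `density`."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same data
    return [
        [1 if rng.random() < density else 0 for _ in range(n_items)]
        for _ in range(n_transactions)
    ]

def to_transactional(binary_rows, item_names):
    """Convert each 0/1 row to the list of item names whose bit is set."""
    return [
        [name for name, bit in zip(item_names, row) if bit]
        for row in binary_rows
    ]

# A hand-written binary dataset over three hypothetical items.
rows = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
transactions = to_transactional(rows, ["A", "B", "C"])

# A synthetic dataset: 5 transactions over 4 items.
synthetic = generate_binary_dataset(5, 4, density=0.5)
```

    The transactional form is what itemset miners usually consume, while the binary form is convenient for matrix-style processing.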

  16. The Books Recommend Service System Based on Improved Algorithm for Mining Association Rules

    Institute of Scientific and Technical Information of China (English)

    王萍

    2009-01-01

    The Apriori algorithm is a classical method of association rules mining. Based on an analysis of this theory, the paper provides an improved Apriori algorithm. The proposed algorithm combines the hash table technique with a reduction of candidate item sets to enhance the efficiency of resource usage as well as the individualized service of the data library.

  17. A Personalized Collaborative Filtering Recommendation Using Association Rules Mining and Self-Organizing Map

    Directory of Open Access Journals (Sweden)

    Hongwu Ye

    2011-04-01

    Full Text Available With the development of the Internet, the problem of information overload is becoming increasingly serious. We have all experienced the feeling of being overwhelmed by the number of new books, articles, and proceedings coming out each year. Many researchers are paying more attention to building a proper tool that can help users obtain personalized resources. Personalized recommendation systems are one such software tool, used to help users obtain recommendations for unseen items based on their preferences. The commonly used personalized recommendation methods are content-based filtering, collaborative filtering, and association rules mining. Unfortunately, each method has its drawbacks. This paper presents a personalized collaborative filtering recommendation method combining association rules mining and the self-organizing map. It uses association rules mining to fill vacant entries where necessary. Then it employs the clustering function of the self-organizing map to form the nearest neighbors of the target item, and it produces a prediction of the target user for the target item using item-based collaborative filtering. The recommendation method combining association rules mining and collaborative filtering can alleviate the data sparsity problem in recommender systems.

  18. A Review of Protein-DNA Binding Motif using Association Rule Mining

    Directory of Open Access Journals (Sweden)

    Virendra Kumar Tripathi,

    2013-04-01

    Full Text Available The survival of gene regulation and life mechanisms is a prerequisite of finding unknown patterns of transcription factor binding sites. Motif discovery for gene regulation in bioinformatics is a challenging job for getting the relation between transcription factors and transcription factor binding sites. The increasing size and length of the motif string patterns raise a problem related to modeling and optimization of the gene selection process. In this paper we give a survey of protein-DNA binding using association rule mining. Association rule mining is a well-known data mining technique for pattern analysis. The capability of negative and positive pattern generation is helpful for discovering new patterns in DNA binding bioinformatics data. Other data mining approaches such as clustering and classification have also been applied to the process of gene selection grouping for known and unknown patterns. But faced with the problem of valid strings of DNA data, the rule mining principle finds a better relation between transcription factors and transcription factor binding sites.

  19. A Review of Protein-DNA Binding Motif using Association Rule Mining

    Directory of Open Access Journals (Sweden)

    Virendra Kumar Tripathi

    2013-03-01

    Full Text Available The survival of gene regulation and life mechanisms is a prerequisite of finding unknown patterns of transcription factor binding sites. Motif discovery for gene regulation in bioinformatics is a challenging job for getting the relation between transcription factors and transcription factor binding sites. The increasing size and length of the motif string patterns raise a problem related to modeling and optimization of the gene selection process. In this paper we give a survey of protein-DNA binding using association rule mining. Association rule mining is a well-known data mining technique for pattern analysis. The capability of negative and positive pattern generation is helpful for discovering new patterns in DNA binding bioinformatics data. Other data mining approaches such as clustering and classification have also been applied to the process of gene selection grouping for known and unknown patterns. But faced with the problem of valid strings of DNA data, the rule mining principle finds a better relation between transcription factors and transcription factor binding sites.

  20. A Novel Approach for Discovery Quantitative Fuzzy Multi-Level Association Rules Mining Using Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Saad M. Darwish

    2016-10-01

    Full Text Available Quantitative multilevel association rules mining is a central field for discovering interesting associations among data components with multiple levels of abstraction. The problem of extending procedures to handle quantitative data has been attracting the attention of many researchers. The algorithms regularly discretize the attribute fields into sharp intervals, and then apply simple algorithms established for Boolean attributes. Fuzzy association rules mining approaches are intended to overcome such shortcomings based on fuzzy set theory. Furthermore, most of the current algorithms on this topic are based on exhaustive search methods to determine the ideal support and confidence thresholds, and they suffer from high computational cost in searching association rules. To accelerate quantitative multilevel association rules searching and avoid extreme computation, in this paper we propose a new genetic-based method with significant innovation to determine threshold values for frequent item sets. In this approach, a sophisticated coding method is designed, and the qualified confidence is employed as the fitness function. With the genetic algorithm, a comprehensive search can be achieved and system automation is applied, because our model does not need a user-specified threshold of minimum support. Experimental results indicate that the recommended algorithm can powerfully generate non-redundant fuzzy multilevel association rules.

  1. A LFP-tree based method for association rules mining in telecommunication alarm correlation analysis

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    The mining of association rules is one of the primary methods used in telecommunication alarm correlation analysis, in which the alarm databases are very large. The efficiency of the algorithms plays an important role in tackling large datasets. The classical frequent pattern growth (FP-growth) algorithm can produce a large number of conditional pattern trees, which makes it difficult to mine association rules in a telecommunication environment. In this paper, an algorithm based on the layered frequent pattern tree (LFP-tree) is proposed for mining frequent patterns. The efficiency of this algorithm is achieved with the following techniques: 1) All the frequent patterns are condensed into a layered structure, which can save memory and time and is also very useful for updating the alarm databases. 2) Each alarm item can be viewed as a triple, in which t is a Boolean variable that shows whether the item is frequent or not. 3) Deleting infrequent items with dynamic pruning can avoid producing conditional pattern sets. Simulation and analysis of the algorithm show that it is a valid method with better time and space efficiency, which is suited to mining association rules in telecommunication alarm correlation analysis.

  2. Mining Association Rules to Evade Network Intrusion in Network Audit Data

    Directory of Open Access Journals (Sweden)

    Kamini Nalavade

    2014-06-01

    Full Text Available With the growth of hacking and exploiting tools and the invention of new ways of intrusion, intrusion detection and prevention is becoming the major challenge in the world of network security. The increasing network traffic and data on the Internet is making this task more demanding. Various approaches are being used in intrusion detection, but unfortunately none of the systems so far is completely flawless. The false positive rates make it extremely hard to analyse and react to attacks. Intrusion detection systems using data mining approaches make it possible to search for patterns and rules in large amounts of audit data. In this paper, we present a model that integrates association rules into intrusion detection to design and implement a network intrusion detection system. Our technique is used to generate attack rules that will detect the attacks in network audit data using anomaly detection. This shows that the modified association rules algorithm is capable of detecting network intrusions. The KDD dataset, which is freely available online, is used for our experimentation, and the results are compared. Our intrusion detection system using association rule mining is able to generate attack rules that will detect the attacks in network audit data using anomaly detection, while maintaining a low false positive rate.

  3. A pragmatic approach on association rule mining and its effective utilization in large databases

    Directory of Open Access Journals (Sweden)

    Biswaranjan Nayak

    2012-05-01

    Full Text Available This paper deals with the effective utilization of association rule mining algorithms in large databases, especially those used by business organizations, where the number of transactions and items plays a crucial role in decision making. Frequent item-set generation and the creation of strong association rules from the frequent item-set patterns are the two basic steps in association rule mining. We take a suitable illustration of market basket data, generate the frequent item-set patterns and association rules from it with the Apriori algorithm, and then use the same illustration for FP-Growth association rule mining: an FP-Growth tree is constructed for frequent item-set generation, and strong association rules are created from it. Experiments were performed to study the performance of the Apriori and FP-Tree algorithms. The customer purchase behaviour seen in food outlet environments is mimicked in these transactions. Using the synthetic data generation process, the observations are plotted in graphs of execution time against minimum support count. The graphs show that as the minimum support values decrease, the execution times of the algorithms increase exponentially, because a lower minimum support threshold makes the number of item-sets in the output grow exponentially. The graphs establish that the performance of FP-Growth is better than that of the Apriori algorithm for all problem sizes with factor 2 for high minimum support values to very low level support magnitude.
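
    The two basic steps named in this abstract, frequent item-set generation and strong rule creation, can be sketched in Python as follows. This is a minimal textbook-style Apriori, not the paper's code: the function names, the market basket data and the thresholds are hypothetical, and no FP-Growth tree is built.

```python
from itertools import combinations

def apriori(transactions, min_support_count):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support_count}
        frequent.update(current)
        k += 1
    return frequent

def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # confidence(A => B) = support(A and B) / support(A)
                confidence = count / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules

# Hypothetical market basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support_count=3)
rules = strong_rules(frequent, min_confidence=0.7)
```

    Lowering `min_support_count` lets more candidates survive each level, which is the effect on execution time that the abstract reports.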

  4. Effect of Temporal Relationships in Associative Rule Mining for Web Log Data

    OpenAIRE

    2014-01-01

    The advent of web-based applications and services has created diverse and voluminous web log data stored in web servers, proxy servers, client machines, or organizational databases. This paper attempts to investigate the effect of the temporal attribute in relational rule mining for web log data. We incorporated the characteristics of time in the rule mining process and analysed the effect of various temporal parameters. The rules generated from temporal relational rule mining are then compa...

  5. Comparative Study of Improved Association Rules Mining Based On Shopping System

    Directory of Open Access Journals (Sweden)

    Tang Zhi-hang

    2016-01-01

    Full Text Available Data mining is a process of discovering interesting patterns, new rules and information from large amounts of sales data in transactional and relational databases. The main purpose of this function is to find frequent patterns, associations and relationships between various databases using different algorithms. Association rule mining (ARM) is used to improve decision making in applications. ARM has become essential in an information- and decision-overloaded world. It has changed the way users make decisions, and helped its creators to increase revenue at the same time. Bringing ARM to a broader audience is essential in order to popularize it beyond the limits of scientific research and high technology entrepreneurship. It can be used to develop and apply effective marketing strategies, and in disease identification frequent patterns are generated to discover the frequently occurring diseases in a definite area. The outcome in all applications is some kind of association rules (AR) that are useful for efficient decision making.

  6. Design a Weight Based sorting distortion algorithm using Association rule Hiding for Privacy Preserving Data mining

    Directory of Open Access Journals (Sweden)

    R.Sugumar

    2011-12-01

    Full Text Available The security of a large database that contains crucial information becomes a serious issue when the data are shared over a network and must be protected against unauthorized access. Privacy preserving data mining is a new research trend in privacy data for data mining and statistical databases. Association analysis is a powerful tool for discovering relationships which are hidden in large databases. Association rule hiding algorithms achieve strong and efficient performance in protecting confidential and crucial data. Data modification and rule hiding is one of the most important approaches for securing data. The objective of the proposed Weight Based Sorting Distortion (WBSD) algorithm is to distort certain data which satisfies a particular sensitive rule. It then hides those transactions which support a sensitive rule, assigns them a priority and sorts them in ascending order according to the priority value of each rule. Then it uses these weights to compute the priority value for each transaction according to how weak the rule is that a transaction supports. Data distortion is one of the important methods to avoid this kind of scalability issues.

  7. Mining Positive and Negative Association Rules: An Approach for Binary Trees

    Directory of Open Access Journals (Sweden)

    John Tsiligaridis

    2013-01-01

    Full Text Available Mining association rules and especially the negative ones has received a lot of attention and has been proved to be useful in the real world. In this work, a set of algorithms for finding both positive and negative association rules (NAR) in databases is presented. A variant of the Apriori, traditional association rules algorithm, is achieved using support and confidence in order to discover two types of NAR; the confined negative association rules (CNR), and the generalized negative association rules (GNAR). For the CNR, where only one negative rule exists among positive ones, the negative rule can be discovered by applying the measure of correlation in terms of the conditional and marginal probability along with the contingency tables. This measure is also used for finding positive rules in the case of branches of itemsets. The negative associations of CNR can be used for substitution of items in market basket analysis. A method of Binary Tree Rules Construction (BTRC) has been developed for the discovery of rules that belong to GNAR, when one or more negative rules along with positive ones exist. In each computation process from disjoint sets, the BTRC produces nested subtrees in order to find the NAR. BTRC is based on successive partitioning of the events of observing a sequence with a certain number of positive and negative items. A set of formulas depending on the height of the tree has been developed. The process can be divided into two parts; the external and the internal subtree process. For the discovery of both types of rules an algorithm (BTA) is developed based on a traditional method and the BTRC.
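
    A correlation measure built from conditional and marginal probabilities over a contingency table can be sketched in Python as below. The abstract does not give its exact formula, so this sketch assumes the common ratio P(A and B) / (P(A) P(B)), which equals P(B|A) / P(B); values below 1 suggest a negative association such as A => not B. The function names and the tea and coffee data are hypothetical.

```python
def contingency(transactions, a, b):
    """Return counts (both, a only, b only, neither) for items a and b."""
    n_ab = n_a_only = n_b_only = n_neither = 0
    for t in transactions:
        has_a, has_b = a in t, b in t
        if has_a and has_b:
            n_ab += 1
        elif has_a:
            n_a_only += 1
        elif has_b:
            n_b_only += 1
        else:
            n_neither += 1
    return n_ab, n_a_only, n_b_only, n_neither

def correlation(transactions, a, b):
    """Assumed measure: P(a and b) / (P(a) * P(b)); requires both items to occur."""
    n_ab, n_a_only, n_b_only, n_neither = contingency(transactions, a, b)
    n = n_ab + n_a_only + n_b_only + n_neither
    p_a = (n_ab + n_a_only) / n
    p_b = (n_ab + n_b_only) / n
    if p_a == 0 or p_b == 0:
        raise ValueError("correlation is undefined when an item never occurs")
    return (n_ab / n) / (p_a * p_b)

# Hypothetical data: 1 basket with both, 5 coffee only, 3 tea only, 1 with neither.
data = (
    [{"coffee", "tea"}]
    + [{"coffee"}] * 5
    + [{"tea"}] * 3
    + [{"bread"}]
)
table = contingency(data, "coffee", "tea")
corr = correlation(data, "coffee", "tea")
# Confidence of the negative rule coffee => not tea, from the same table.
neg_confidence = table[1] / (table[0] + table[1])
```

    Here the measure is well below 1, so coffee and tea look like substitutes, which is the kind of use in market basket analysis that the abstract mentions for confined negative rules.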

  8. Recommending new items to customers : A comparison between Collaborative Filtering and Association Rule Mining

    OpenAIRE

    Sohlberg, Henrik

    2015-01-01

    E-commerce is an ever growing industry as the internet infrastructure continues to evolve. The benefits from a recommendation system to any online retail store are several. It can help customers to find what they need as well as increase sales by enabling accurate targeted promotions. Among many techniques that can form recommendation systems, this thesis compares Collaborative Filtering against Association Rule Mining, both implemented in combination with clustering. The suggested implementa...

  9. An Association Rule Mining-Based Framework for Understanding Lifestyle Risk Behaviors

    OpenAIRE

    Park, So Hyun; Jang, Shin Yi; Kim, Ho; Lee, Seung Wook

    2014-01-01

    Objectives This study investigated the prevalence and patterns of lifestyle risk behaviors in Korean adults. Methods We utilized data from the Fourth Korea National Health and Nutrition Examination Survey for 14,833 adults (>20 years of age). We used association rule mining to analyze patterns of lifestyle risk behaviors by characterizing non-adherence to public health recommendations related to the Alameda 7 health behaviors. The study variables were current smoking, heavy drinking, physical...

  10. Towards a Text Mining Methodology Using Frequent Itemsets and Association Rule Extraction

    OpenAIRE

    Cherfi, Hacène; Napoli, Amedeo; Toussaint, Yannick

    2003-01-01

    This paper proposes a methodology for text mining relying on the classical knowledge discovery loop, with a number of adaptations. First, texts are indexed and prepared to be processed by frequent itemset levelwise search. Association rules are then extracted and interpreted, with respect to a set of quality measures and domain knowledge, under the control of an analyst. The article includes an experiment on a real-world text corpus on molecular biology.

  11. Efficient Mining of Association Rules by Reducing the Number of Passes over the Database

    Institute of Scientific and Technical Information of China (English)

    李庆忠; 王海洋; 闫中敏; 马绍汉

    2001-01-01

    This paper introduces a new algorithm for mining association rules. The algorithm, RP, counts itemsets of different sizes in the same scan of the database by dividing the database into m partitions. The total number of passes over the database is only (k+2m-2)/m, where k is the size of the longest itemset. This is much less than k.
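
    As a quick check of the pass count quoted above, the following sketch evaluates (k+2m-2)/m for a few values. The function name `rp_passes` and the sample values of k and m are illustrative assumptions; the abstract does not say how a fractional result should be interpreted, so the value is returned as an exact fraction.

```python
from fractions import Fraction


def rp_passes(k: int, m: int) -> Fraction:
    """Pass count quoted in the abstract: (k + 2m - 2) / m.

    k: size of the longest itemset, m: number of database partitions.
    The name and signature are hypothetical, chosen for illustration only.
    """
    if k < 1 or m < 1:
        raise ValueError("k and m must be positive integers")
    return Fraction(k + 2 * m - 2, m)


# Hypothetical values: longest itemset of size 10.
for m in (1, 2, 4):
    print(m, rp_passes(10, m))
```

    With a single partition (m = 1) the expression reduces to k, the pass count of a plain level-wise scan, which is consistent with the claim that partitioning lowers the number of passes.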

  12. Performance Evaluation of Sequential and Parallel Mining of Association Rules using Apriori Algorithms

    Directory of Open Access Journals (Sweden)

    Puttegowda D

    2010-07-01

    Full Text Available The information age has seen most activities generating huge volumes of data. The explosive growth of business, scientific and government database sizes has far outpaced our ability to interpret and digest the stored data. This has created a need for new generation tools and techniques for automated and intelligent database analysis. These tools and techniques are the subjects of the rapidly emerging field of data mining. One of the important problems in data mining is discovering association rules from databases of transactions where each transaction consists of a set of items. The most time consuming operation in this discovery process is the computation of the frequency of the occurrences of interesting subsets of items (called candidates) in the database of transactions. To prune the exponentially large space of candidates, most existing algorithms consider only those candidates that have a user defined minimum support. Even with the pruning, the task of finding all association rules requires a lot of computation power and memory. Parallel computers offer a potential solution to the computation requirement of this task, provided efficient and scalable parallel algorithms can be designed. In this paper, we have implemented Sequential and Parallel mining of Association Rules using Apriori algorithms and evaluated the performance of both algorithms.
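
    The candidate counting and minimum-support pruning described above can be sketched as follows. This is a minimal, sequential, textbook-style Apriori written for illustration; it is not the authors' implementation, it does not cover the parallel variant, and the function name `apriori` and the sample baskets are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count. Illustrative sketch only.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items in one pass.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy database of four transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]
frequent_itemsets = apriori(baskets, min_support=2)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

    The count step is the part the abstract identifies as most time consuming, since it touches every transaction once per level.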

  13. A novel biclustering approach to association rule mining for predicting HIV-1-human protein interactions.

    Directory of Open Access Journals (Sweden)

    Anirban Mukhopadhyay

    Full Text Available Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need of negative samples. Motivated by this, in this article, the HIV-1-human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1-human interaction network. Novel HIV-1-human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, literature survey has been used for validation purpose to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed.

  14. Association rule mining based study for identification of clinical parameters akin to occurrence of brain tumor.

    Science.gov (United States)

    Sengupta, Dipankar; Sood, Meemansa; Vijayvargia, Poorvika; Hota, Sunil; Naik, Pradeep K

    2013-01-01

    The healthcare sector is generating a large amount of information corresponding to diagnosis, disease identification and treatment of an individual. Mining knowledge and providing scientific decision-making for the diagnosis & treatment of disease from the clinical dataset is therefore increasingly becoming necessary. The aim of this study was to assess the applicability of knowledge discovery in a brain tumor data warehouse, applying data mining techniques for investigation of clinical parameters that can be associated with occurrence of brain tumor. In this study, a brain tumor warehouse was developed comprising clinical data for 550 patients. The Apriori association rule algorithm was applied to discover associative rules among the clinical parameters. The rules discovered in the study suggest that high values of Creatinine, Blood Urea Nitrogen (BUN), SGOT & SGPT are directly associated with tumor occurrence for patients in the primary stage, with at least 85% confidence and more than 50% support. A normalized regression model is proposed based on these parameters along with Haemoglobin content, Alkaline Phosphatase and Serum Bilirubin for prediction of occurrence of STATE (brain tumor) as 0 (absent) or 1 (present). The results indicate that the methodology followed will be of good value for the diagnostic procedure of brain tumor, especially when large data volumes are involved and screening based on discovered parameters would allow clinicians to detect tumors at an early stage of development. PMID:23888095

  15. Effect of temporal relationships in associative rule mining for web log data.

    Science.gov (United States)

    Khairudin, Nazli Mohd; Mustapha, Aida; Ahmad, Mohd Hanif

    2014-01-01

    The advent of web-based applications and services has created such diverse and voluminous web log data stored in web servers, proxy servers, client machines, or organizational databases. This paper attempts to investigate the effect of temporal attribute in relational rule mining for web log data. We incorporated the characteristics of time in the rule mining process and analysed the effect of various temporal parameters. The rules generated from temporal relational rule mining are then compared against the rules generated from the classical rule mining approach such as the Apriori and FP-Growth algorithms. The results showed that by incorporating the temporal attribute via time, the number of rules generated is subsequently smaller but is comparable in terms of quality. PMID:24587757

  16. Association Rules in Data Mining: An Application on a Clothing and Accessory Specialty Store

    Directory of Open Access Journals (Sweden)

    Mutlu Yüksel Avcilar

    2014-04-01

    Full Text Available Retailers provide important functions that increase the value of the products and services they sell to consumers. Retailers' value-creating functions are providing an assortment of products and services, breaking bulk, holding inventory, and providing services. For a long time, retail store managers have been interested in learning about the within-category and cross-category purchase behavior of their customers, since valuable insights for designing marketing and/or targeted cross-selling programs can be derived. In particular, parallel to the development of information processing and communication technologies, it has become possible to transfer customers' shopping information into databases with the help of barcode technology. Data mining is the technique of extracting significant and useful information from large amounts of data. Association rule mining is realized by using market basket analysis to discover relationships among items purchased by customers in transaction databases. In this study, association rules were estimated by using market basket analysis and taking support, confidence and lift measures into consideration. In the analysis, 2012 data from a clothing and accessory specialty store operating in the province of Osmaniye were used: a data set of 42,390 sales transactions including 9,000 different product kinds in 35 different product categories (SKU). Analyses were carried out with the help of the SPSS Clementine package and 25,470 rules were determined.
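
    The three measures named above can be computed directly from transaction counts. The sketch below uses the standard definitions (support = P(A and C), confidence = P(C | A), lift = confidence / P(C)); the function name `rule_metrics` and the toy transactions are assumptions and have nothing to do with the store data used in the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of items. Illustrative helper, not from the paper.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_ac / n            # share of baskets containing both sides
    confidence = n_ac / n_a       # share of antecedent baskets that also hold the consequent
    lift = confidence / (n_c / n) # confidence relative to the consequent's base rate
    return support, confidence, lift


# Hypothetical baskets from a clothing store.
sales = [
    {"shirt", "tie"},
    {"shirt", "tie", "belt"},
    {"shirt"},
    {"belt"},
]
support, confidence, lift = rule_metrics(sales, {"shirt"}, {"tie"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    A lift above 1 indicates that the two sides co-occur more often than they would if they were independent.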

  17. Association rule mining on grid monitoring data to detect error sources

    International Nuclear Information System (INIS)

    Error handling is a crucial task in an infrastructure as complex as a grid. There are several monitoring tools put in place, which report failing grid jobs including exit codes. However, the exit codes do not always denote the actual fault, which caused the job failure. Human time and knowledge is required to manually trace back errors to the real fault underlying an error. We perform association rule mining on grid job monitoring data to automatically retrieve knowledge about the grid components' behavior by taking dependencies between grid job characteristics into account. Therewith, problematic grid components are located automatically and this information - expressed by association rules - is visualized in a web interface. This work achieves a decrease in time for fault recovery and yields an improvement of a grid's reliability.

  18. Association rule mining on grid monitoring data to detect error sources

    CERN Document Server

    Maier, G; Kranzlmueller, D; Gaidioz, B

    2010-01-01

    Error handling is a crucial task in an infrastructure as complex as a grid. There are several monitoring tools put in place, which report failing grid jobs including exit codes. However, the exit codes do not always denote the actual fault, which caused the job failure. Human time and knowledge is required to manually trace back errors to the real fault underlying an error. We perform association rule mining on grid job monitoring data to automatically retrieve knowledge about the grid components' behavior by taking dependencies between grid job characteristics into account. Therewith, problematic grid components are located automatically and this information – expressed by association rules – is visualized in a web interface. This work achieves a decrease in time for fault recovery and yields an improvement of a grid's reliability.

  19. Association rule mining on grid monitoring data to detect error sources

    Science.gov (United States)

    Maier, Gerhild; Schiffers, Michael; Kranzlmueller, Dieter; Gaidioz, Benjamin

    2010-04-01

    Error handling is a crucial task in an infrastructure as complex as a grid. There are several monitoring tools put in place, which report failing grid jobs including exit codes. However, the exit codes do not always denote the actual fault, which caused the job failure. Human time and knowledge is required to manually trace back errors to the real fault underlying an error. We perform association rule mining on grid job monitoring data to automatically retrieve knowledge about the grid components' behavior by taking dependencies between grid job characteristics into account. Therewith, problematic grid components are located automatically and this information - expressed by association rules - is visualized in a web interface. This work achieves a decrease in time for fault recovery and yields an improvement of a grid's reliability.

  20. Discovering protein-DNA binding sequence patterns using association rule mining.

    Science.gov (United States)

    Leung, Kwong-Sak; Wong, Ka-Chun; Chan, Tak-Ming; Wong, Man-Hon; Lee, Kin-Hong; Lau, Chi-Kong; Tsui, Stephen K W

    2010-10-01

    Protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play an essential role in transcriptional regulation. Over the past decades, significant efforts have been made to study the principles for protein-DNA bindings. However, it is considered that there are no simple one-to-one rules between amino acids and nucleotides. Many methods impose complicated features beyond sequence patterns. Protein-DNA bindings are formed from associated amino acid and nucleotide sequence pairs, which determine many functional characteristics. Therefore, it is desirable to investigate associated sequence patterns between TFs and TFBSs. With increasing computational power, availability of massive experimental databases on DNA and proteins, and mature data mining techniques, we propose a framework to discover associated TF-TFBS binding sequence patterns in the most explicit and interpretable form from TRANSFAC. The framework is based on association rule mining with Apriori algorithm. The patterns found are evaluated by quantitative measurements at several levels on TRANSFAC. With further independent verifications from literatures, Protein Data Bank and homology modeling, there are strong evidences that the patterns discovered reveal real TF-TFBS bindings across different TFs and TFBSs, which can drive for further knowledge to better understand TF-TFBS bindings. PMID:20529874

  1. An Overview of Secure Mining of Association Rules in Horizontally Distributed Databases

    Directory of Open Access Journals (Sweden)

    Sonal Patil

    2015-10-01

    Full Text Available In this paper, we propose a protocol for secure mining of association rules in horizontally distributed databases. Nowadays, the leading protocol is that of Kantarcioglu and Clifton. This protocol is based on the Fast Distributed Mining (FDM) algorithm, which is an unsecured distributed version of the Apriori algorithm. The main ingredients in this protocol are two novel secure multi-party algorithms: (1) one that computes the union of private subsets that each of the interacting players holds, and (2) one that tests the inclusion of an element held by one player in a subset held by another. This protocol offers enhanced privacy with respect to the other one. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost [1].

  2. An Overview of Secure Mining of Association Rules in Horizontally Distributed Databases

    Directory of Open Access Journals (Sweden)

    Ms. Sonal Patil

    2014-01-01

    Full Text Available In this paper, we propose a protocol for secure mining of association rules in horizontally distributed databases. Nowadays, the leading protocol is that of Kantarcioglu and Clifton. This protocol is based on the Fast Distributed Mining (FDM) algorithm, which is an unsecured distributed version of the Apriori algorithm. The main ingredients in this protocol are two novel secure multi-party algorithms: (1) one that computes the union of private subsets that each of the interacting players holds, and (2) one that tests the inclusion of an element held by one player in a subset held by another. This protocol offers enhanced privacy with respect to the other one. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost [1].

  3. A Business Intelligence Model to Predict Bankruptcy using Financial Domain Ontology with Association Rule Mining Algorithm

    Directory of Open Access Journals (Sweden)

    A Martin

    2011-05-01

    Full Text Available Today in every organization financial analysis provides the basis for understanding and evaluating the results of business operations and showing how well a business is doing. This means that the organizations can control the operational activities primarily related to corporate finance. One way of doing this is by analysis of bankruptcy prediction. This paper develops an ontological model from financial information of an organization by analyzing the semantics of the financial statement of a business. One of the best bankruptcy prediction models is the Altman Z-score model. The Altman Z-score method uses financial ratios to predict bankruptcy. From the financial ontological model the relation between financial data is discovered by using a data mining algorithm. By combining the financial domain ontological model with an association rule mining algorithm and the Z-score model, a new business intelligence model is developed to predict bankruptcy.

  4. A Business Intelligence Model to Predict Bankruptcy using Financial Domain Ontology with Association Rule Mining Algorithm

    CERN Document Server

    Martin, A; Venkatesan, Dr V Prasanna

    2011-01-01

    Today in every organization financial analysis provides the basis for understanding and evaluating the results of business operations and showing how well a business is doing. This means that the organizations can control the operational activities primarily related to corporate finance. One way of doing this is by analysis of bankruptcy prediction. This paper develops an ontological model from financial information of an organization by analyzing the semantics of the financial statement of a business. One of the best bankruptcy prediction models is the Altman Z-score model. The Altman Z-score method uses financial ratios to predict bankruptcy. From the financial ontological model the relation between financial data is discovered by using a data mining algorithm. By combining the financial domain ontological model with an association rule mining algorithm and the Z-score model, a new business intelligence model is developed to predict bankruptcy.

  5. icuARM-An ICU Clinical Decision Support System Using Association Rule Mining

    Science.gov (United States)

    Chanani, Nikhil; Venugopalan, Janani; Maher, Kevin; Wang, May Dongmei

    2013-01-01

    The rapid development of biomedical monitoring technologies has enabled modern intensive care units (ICUs) to gather vast amounts of multimodal measurement data about their patients. However, processing large volumes of complex data in real time has become a big challenge. Together with ICU physicians, we have designed and developed an ICU clinical decision support system, icuARM, based on association rule mining (ARM) and a publicly available research database, MIMIC-II (Multi-parameter Intelligent Monitoring in Intensive Care II), that contains more than 40,000 ICU records for 30,000+ patients. icuARM is constructed with multiple association rules and an easy-to-use graphical user interface (GUI) for care providers to perform real-time data and information mining in the ICU setting. To validate icuARM, we have investigated the associations between patients' conditions such as comorbidities, demographics, and medications and their ICU outcomes such as ICU length of stay. Coagulopathy surfaced as the most dangerous co-morbidity that leads to the highest possibility (54.1%) of prolonged ICU stay. In addition, women who are older than 50 years have the highest possibility (38.8%) of prolonged ICU stay. For clinical conditions treatable with multiple drugs, icuARM suggests that medication choice can be optimized based on patient-specific characteristics. Overall, icuARM can provide valuable insights for ICU physicians to tailor a patient's treatment based on his or her clinical status in real time.

  6. Mining association rules between abnormal health examination results and outpatient medical records.

    Science.gov (United States)

    Chao Huang, Yi

    2013-01-01

    Currently, interpretation of health examination reports relies primarily on the physician's own experience. If health screening data could be integrated with outpatient medical records to uncover correlations between disease and abnormal test results, the physician could benefit from having additional reference resources for medical examination report interpretation and clinic diagnosis. This study used the medical database of a regional hospital in Taiwan to illustrate how association rules can be found between abnormal health examination results and outpatient illnesses. The rules can help to build up a disease-prevention knowledge database that assists healthcare providers in follow-up treatment and prevention. Furthermore, this study proposes a new algorithm, the data cutting and sorting method, or DCSM, in place of the traditional Apriori algorithm. DCSM significantly improves the mining performance of Apriori by reducing the time to scan health examination and outpatient medical records, both of which are databases of immense sizes. PMID:23736654

  7. Identifying the Combinatorial Effects of Histone Modifications by Association Rule Mining in Yeast

    Directory of Open Access Journals (Sweden)

    Caisheng He

    2010-09-01

    Full Text Available Eukaryotic genomes are packaged into chromatin by histone proteins whose chemical modification can profoundly influence gene expression. The histone modifications often act in combinations, which exert different effects on gene expression. Although a number of experimental techniques and data analysis methods have been developed to study histone modifications, it is still very difficult to identify the relationships among histone modifications on a genome-wide scale. We proposed a method to identify the combinatorial effects of histone modifications by association rule mining. The method first identified Functional Modification Transactions (FMTs) and then employed an association rule mining algorithm and statistical methods to identify histone modification patterns. We applied the proposed methodology to Pokholok et al.'s data with eight sets of histone modifications and Kurdistani et al.'s data with eleven histone acetylation sites. Our method succeeds in revealing two different global views of histone modification landscapes on two datasets and identifying a number of modification patterns, some of which are supported by previous studies. We concentrate on combinatorial effects of histone modifications which significantly affect gene expression. Our method succeeds in identifying known interactions among histone modifications and uncovering many previously unknown patterns. After in-depth analysis of possible mechanisms by which histone modification patterns can alter transcriptional states, we infer three possible modification pattern reading mechanisms ('redundant', 'trivial', 'dominative'). Our results demonstrate several histone modification patterns which show significant correspondence between yeast and human cells.

  8. Adaptive Interval Configuration to Enhance Dynamic Approach for Mining Association Rules

    Institute of Scientific and Technical Information of China (English)

    1999-01-01

    Most proposed algorithms for mining association rules follow the conventional level-wise approach. The dynamic candidate generation idea introduced in the dynamic itemset counting (DIC) algorithm broke away from the level-wise limitation and could find the large itemsets using fewer passes over the database than level-wise algorithms. However, the dynamic approach is very sensitive to the data distribution of the database and it requires a proper interval size. In this paper an optimization technique named adaptive interval configuration (AIC) has been developed to enhance the dynamic approach. The AIC optimization has the following two functions. The first is that a homogeneous distribution of large itemsets over intervals can be achieved so that fewer unnecessary candidates are generated and fewer database scanning passes are guaranteed. The second is that the near optimal interval size can be determined adaptively to produce the best response time. We also developed a candidate pruning technique named virtual partition pruning to reduce the size-2 candidate set and incorporated it into the AIC optimization. Based on the optimization technique, we proposed the efficient AIC algorithm for mining association rules. The algorithms of AIC, DIC and the classic Apriori were implemented on a Sun Ultra Enterprise 4000 for performance comparison. The results show that the AIC performed much better than both DIC and Apriori, and showed a strong robustness.

  9. A New Model of Mining Association Rules

    Institute of Scientific and Technical Information of China (English)

    苏毅娟; 严小卫

    2001-01-01

    A new algorithm for mining positive and negative association rules is presented. A new confidence is constructed to measure the uncertainty of an association rule based on the probability theory and Piatetsky-Shapiro's model.

  10. Generalized Multidimensional Association Rules

    Institute of Scientific and Technical Information of China (English)

    周傲英; 周水庚; 金文; 田增平

    2000-01-01

    The problem of association rule mining has gained considerable prominence in the data mining community for its use as an important tool of knowledge discovery from large-scale databases. And there has been a spurt of research activities around this problem. Traditional association rule mining is limited to intra-transaction. Only recently the concept of N-dimensional inter-transaction association rule (NDITAR) was proposed by H.J. Lu. This paper modifies and extends Lu's definition of NDITAR based on the analysis of its limitations, and the generalized multidimensional association rule (GMDAR) is subsequently introduced, which is more general, flexible and reasonable than NDITAR.

  11. An association rule mining-based framework for understanding lifestyle risk behaviors.

    Directory of Open Access Journals (Sweden)

    So Hyun Park

    Full Text Available OBJECTIVES: This study investigated the prevalence and patterns of lifestyle risk behaviors in Korean adults. METHODS: We utilized data from the Fourth Korea National Health and Nutrition Examination Survey for 14,833 adults (>20 years of age). We used association rule mining to analyze patterns of lifestyle risk behaviors by characterizing non-adherence to public health recommendations related to the Alameda 7 health behaviors. The study variables were current smoking, heavy drinking, physical inactivity, obesity, inadequate sleep, breakfast skipping, and frequent snacking. RESULTS: Approximately 72% of Korean adults exhibited two or more lifestyle risk behaviors. Among women, current smoking, obesity, and breakfast skipping were associated with inadequate sleep. Among men, breakfast skipping with additional risk behaviors such as physical inactivity, obesity, and inadequate sleep was associated with current smoking. Current smoking with additional risk behaviors such as inadequate sleep or breakfast skipping was associated with physical inactivity. CONCLUSION: Lifestyle risk behaviors are intercorrelated in Korea. Information on patterns of lifestyle risk behaviors could assist in planning interventions targeted at multiple behaviors simultaneously.

  12. HIDING SENSITIVE ASSOCIATION RULE USING HEURISTIC APPROACH

    Directory of Open Access Journals (Sweden)

    Kasthuri S

    2013-01-01

    Full Text Available Data mining is the process of identifying patterns from large amounts of data. Association rule mining aims to discover dependency relationships across attributes. It may also disclose sensitive information. With extensive application of data mining techniques to various domains, privacy preservation becomes mandatory. Association rule hiding is one of the techniques of privacy preserving data mining to protect the sensitive association rules generated by association rule mining. This paper adopts a heuristic approach for hiding sensitive association rules. The proposed technique makes the representative rules and hides the sensitive rules.

  13. Research of an Improved Apriori Algorithm in Data Mining Association Rules

    OpenAIRE

    Jiao Yabing

    2013-01-01

    The Apriori algorithm is the classic algorithm of association rules, which enumerates all of the frequent item sets. When this algorithm encounters dense data, its performance declines dramatically because of the large number of long patterns that emerge. In order to find more valuable rules, this paper proposes an improved algorithm of association rules based on the classical Apriori algorithm. Finally, the improved algorithm is verified, the results show that the improved algorithm is reasonable and effect...

  14. Creating a prediction model for weather forecasting based on artificial neural network supported by association rules mining

    OpenAIRE

    Kadlec, Jakub

    2016-01-01

    This diploma thesis focuses on creating a predictive model for the purpose of automated weather predictions based on a neural network. Attributes for input layer of the network are selected through association rules mining using the 4ft-Miner procedure. First part of the thesis consists of collection of theoretical knowledge enabling the creation of such predictive model, whereas the second part describes the creation of the model itself using the CRISP-DM methodology. Final part of the thesi...

  15. Selective association rule generation

    CERN Document Server

    Hahsler, Michael; Hornik, Kurt

    2008-01-01

    Mining association rules is a popular and well researched method for discovering interesting relations between variables in large databases. A practical problem is that at medium to low support values often a large number of frequent itemsets and an even larger number of association rules are found in a database. A widely used approach is to gradually increase minimum support and minimum confidence or to filter the found rules using increasingly strict constraints on additional measures of interestingness until the set of rules found is reduced to a manageable size. In this paper we describe a different approach which is based on the idea to first define a set of "interesting" itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step to selectively generate rules for only these itemsets. The main advantage of this approach over increasing thresholds or filtering rules is that the number of rules found is significantly reduced while at the same time it is not necessary to increa...
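
    The two-step idea described above (fix a set of interesting itemsets first, then generate rules only for those) might look roughly like the sketch below. Since the abstract is truncated, this is an assumption about the general shape of the second step and not the authors' procedure; the function name `rules_for_itemset`, the confidence threshold and the toy data are all hypothetical.

```python
from itertools import combinations


def rules_for_itemset(transactions, itemset, min_confidence):
    """Generate rules lhs -> rhs whose items together make up `itemset`.

    Returns a list of (lhs, rhs, confidence) tuples. Illustrative sketch only.
    """
    itemset = frozenset(itemset)

    def count(s):
        return sum(1 for t in transactions if s <= t)

    n_itemset = count(itemset)
    rules = []
    # Try every non-empty proper subset of the itemset as the left-hand side.
    for size in range(1, len(itemset)):
        for lhs_items in combinations(sorted(itemset), size):
            lhs = frozenset(lhs_items)
            n_lhs = count(lhs)
            if n_lhs and n_itemset / n_lhs >= min_confidence:
                rules.append((lhs, itemset - lhs, n_itemset / n_lhs))
    return rules


# Hypothetical transactions and a single expert-chosen itemset.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "butter"}),
]
interesting = [{"bread", "milk"}]
selected_rules = [
    rule for itemset in interesting
    for rule in rules_for_itemset(baskets, itemset, min_confidence=0.6)
]
for lhs, rhs, conf in selected_rules:
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

    Only rules over the chosen itemsets are ever materialized, so the rule set stays small without raising the support threshold.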

  16. A Novel Association Rule Mining with IEC Ratio Based Dissolved Gas Analysis for Fault Diagnosis of Power Transformers

    Directory of Open Access Journals (Sweden)

    Ms. Kanika Shrivastava

    2012-06-01

    Full Text Available Dissolved gas analysis (DGA) is the most important component of finding faults in large oil-filled transformers. Early detection of incipient faults in transformers reduces costly unplanned outages. The most sensitive and reliable technique for evaluating the core of a transformer is dissolved gas analysis. In this paper we evaluate different transformer conditions in different cases. This paper uses dissolved gas analysis to study the history of different transformers in service, from which dissolved combustible gases (DCG) in oil are used as a diagnostic tool for evaluating the condition of the transformer. Oil quality and dissolved gases tests are comparatively used for this purpose. In this paper we present a novel approach which is based on association rule mining and the IEC ratio method. By using data mining concepts we can categorize faults based on single and multiple associations and also map the percentage of fault. This is an efficient approach for fault diagnosis of power transformers where we can find the fault in all obvious conditions. We use Java for programming and comparative study.

  17. GenMiner: mining non-redundant association rules from integrated gene expression data and annotations

    OpenAIRE

    Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

    2008-01-01

    GenMiner is an implementation of association rule discovery dedicated to the analysis of genomic data. It allows the analysis of datasets integrating multiple sources of biological data represented as both discrete values, such as gene annotations, and continuous values, such as gene expression measures. GenMiner implements the new NorDi (normal discretization) algorithm for normalizing and discretizing continuous values and takes advantage of the JClose algorithm to...

  18. Multi-objective Numeric Association Rules Mining via Ant Colony Optimization for Continuous Domains without Specifying Minimum Support and Minimum Confidence

    Directory of Open Access Journals (Sweden)

    Parisa Moslehi

    2011-09-01

    Full Text Available Currently, all search algorithms that discretize numeric attributes for numeric association rule mining lose the original distribution of those attributes. This leads to a loss of information, so the association rules generated through this process are not precise and accurate. Algorithms that can natively handle numeric attributes are therefore of interest. Since association rule mining can be considered a multi-objective problem rather than a single-objective one, a new multi-objective algorithm for numeric association rule mining is presented in this paper, using Ant Colony Optimization for Continuous domains (ACOR). This algorithm mines numeric association rules in one step, without any need to specify minimum support and minimum confidence. To do this, we modified ACOR to generate rules. The results show that the rules are more precise and accurate after applying this algorithm, and that the number of rules is greater than in previous works.

  19. A PROPOSAL OF FUZZY MULTIDIMENSIONAL ASSOCIATION RULES

    Directory of Open Access Journals (Sweden)

    Rolly Intan

    2006-01-01

    Full Text Available Association rules that involve two or more dimensions or predicates can be referred to as multidimensional association rules. Rather than searching for frequent itemsets (as is done in mining single-dimensional association rules), in multidimensional association rules we search for frequent predicate sets. In general, there are two types of multidimensional association rules, namely interdimension association rules and hybrid-dimension association rules. Interdimension association rules are multidimensional association rules with no repeated predicates. This paper introduces a method for generating interdimension association rules. More meaningful association rules can be obtained by generalizing the crisp values of attributes to fuzzy values. To generate multidimensional association rules involving fuzzy values, this paper introduces an alternative method for mining the rules by searching for the predicate sets.

  20. Mining time-series association rules from Western Pacific spatial-temporal data

    International Nuclear Information System (INIS)

    With increasing concern about environmental problems and the many environmental issues affecting daily life, a new requirement for analysing environmental changes and their effects has emerged. In this paper we use a database of Western Pacific events and basic background data as the data source to find associations between different marine parameters. An improved Apriori algorithm is used to discover knowledge in massive spatio-temporal data. There are two main steps. First, according to the degree of variation at each point, the study area is divided into many spatial-temporal transaction zones. Second, the improved Apriori algorithm is used for spatial-temporal data mining. For the mining algorithm, the quantitative attributes need to be transformed into qualitative attributes; the concept generalization method is used to divide the original attribute data into several levels. The Apriori algorithm can then be used to discover potential associations between marine parameters within the given time frame.
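
    The transformation of quantitative attributes into levels can be illustrated with a minimal sketch. It assumes simple equal-width binning, which stands in for the paper's concept generalization method and is not taken from it; the attribute names, value ranges, and readings are hypothetical.

```python
def to_level(value, low, high, labels):
    """Map a numeric value to one of len(labels) equal-width levels."""
    if high <= low:
        raise ValueError("high must exceed low")
    width = (high - low) / len(labels)
    index = int((value - low) / width)
    index = min(max(index, 0), len(labels) - 1)  # clamp out-of-range values
    return labels[index]


def zone_to_transaction(zone, ranges, labels=("low", "medium", "high")):
    """Turn one zone's numeric readings into a set of 'attribute=level' items."""
    return frozenset(
        f"{name}={to_level(value, *ranges[name], labels)}"
        for name, value in zone.items()
    )


# Hypothetical value ranges and readings for two spatial-temporal zones.
ranges = {"sst": (0.0, 30.0), "chlorophyll": (0.0, 3.0)}
zones = [
    {"sst": 28.5, "chlorophyll": 0.4},
    {"sst": 12.0, "chlorophyll": 2.7},
]
transactions = [zone_to_transaction(z, ranges) for z in zones]
for t in transactions:
    print(sorted(t))
```

    Each resulting transaction is a set of qualitative items, which is the input form an Apriori-style miner expects.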

  1. Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining

    KAUST Repository

    Boudellioua, Imane

    2016-07-08

    The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage, while minimizing erroneous functional assignments. This trade-off imposes a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We carried out an evaluation study of our system performance using cross-validation technique. We found that it achieved very promising results in pathway identification with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, where 436,510 of them lacked any previous pathway annotations.

  2. Analysis of Medical Domain Using CMARM: Confabulation Mapreduce Association Rule Mining Algorithm for Frequent and Rare Itemsets

    Directory of Open Access Journals (Sweden)

    Dr. Jyoti Gautam

    2015-11-01

    Full Text Available Over the human life span, disease is a major cause of illness and death in modern society. Various factors are responsible for disease, such as the work environment, living and working conditions, agriculture and food production, housing, unemployment, and individual lifestyle. Early diagnosis of diseases that occur frequently or rarely with advancing age can help cure the disease completely or to some extent. Long-term analysis of patient records may help identify the causes responsible for particular diseases. People can then take early preventive measures to minimize the risk of diseases that may arise with advancing age and hence increase life expectancy. In this paper, a new algorithm, CMARM (Confabulation-MapReduce based association rule mining), is proposed for analysing medical data repositories for both rare and frequent itemsets using an iterative MapReduce-based framework inspired by cogency. Cogency is the probability of the assumed facts being true if the conclusion is true; that is, it is based on pairwise item conditional probability, so the proposed algorithm mines association rules in only one pass through the file. The proposed algorithm is also valuable for dealing with infrequent items because of its cogency-inspired approach.
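
    The pairwise conditional probability behind cogency can be sketched as follows. This is a single-machine illustration of the counting idea only, not the CMARM algorithm or its MapReduce framework; the function name and the patient records are hypothetical.

```python
from collections import Counter
from itertools import permutations


def pairwise_cogency(transactions):
    """Return {(assumed, conclusion): P(assumed | conclusion)} from one pass."""
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:  # single pass over the data
        items = set(transaction)
        item_counts.update(items)
        # Count every ordered pair of distinct items that co-occur.
        pair_counts.update(permutations(sorted(items), 2))
    return {
        (assumed, conclusion): count / item_counts[conclusion]
        for (assumed, conclusion), count in pair_counts.items()
    }


# Hypothetical patient records.
records = [
    {"smoking", "cough"},
    {"smoking", "cough", "fever"},
    {"cough"},
    {"fever"},
]
cogency = pairwise_cogency(records)
print(cogency[("cough", "smoking")])   # P(cough | smoking)
print(cogency[("smoking", "cough")])   # P(smoking | cough)
```

    Because only item and pair counters are kept, no support threshold is needed during the pass, which is why rare items are not discarded up front.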

  3. Mining Association Rules in Big Data for E-healthcare Information System

    Directory of Open Access Journals (Sweden)

    N. Rajkumar

    2014-08-01

    Full Text Available Big data is characterized by large volume, data sets that grow in multiple ways, and autonomous sources. Big data is now expanding quickly in many advanced domains because of rapid growth in networking and data collection. This study defines an E-Healthcare Information System, which requires a logical and structured method of approaching knowledge, and which must effectively prepare and control the data generated during the diagnostic activities of medical applications by sharing information among E-Healthcare Information System devices. The main objective is an E-Healthcare Information System that is an extensive, integrated knowledge system designed to manage all aspects of a hospital's operation, such as medical, administrative, financial and legal information and the corresponding service processing. Finally, results are generated by applying association mining techniques to big data from hospital information datasets. The mining results can be evaluated in terms of accuracy, precision, recall and positive rate.

  4. Apriori and Ant Colony Optimization of Association Rules

    OpenAIRE

    Anshuman Singh Sadh; Nitin Shukla

    2013-01-01

    Association rule mining is one of the most important and popular data mining techniques. Association rule mining can be used efficiently in any decision-making process or decision-based rule generation. In this paper we present efficient mining-based optimization techniques for rule generation. By using the Apriori algorithm we find the positive and negative association rules. Then we apply the ant colony optimization (ACO) algorithm for optimizing the association rules. Our results show the effecti...

  5. Discovering fuzzy spatial association rules

    Science.gov (United States)

    Kacar, Esen; Cicekli, Nihan K.

    2002-03-01

    Discovering interesting, implicit knowledge and general relationships in geographic information databases is very important for understanding and using these spatial data. One of the methods for discovering this implicit knowledge is mining spatial association rules. A spatial association rule is a rule indicating certain association relationships among a set of spatial and possibly non-spatial predicates. In the mining process, data is organized in a hierarchical manner. However, in real-world applications it may not be possible to construct a crisp structure for this data; instead, some fuzzy structures should be used. Fuzziness, i.e. partial belonging of an item to more than one sub-item in the hierarchy, could be applied to the data itself, and also to the hierarchy of spatial relations. This paper shows that strong association rules can be mined from large spatial databases using fuzzy concept and spatial relation hierarchies.

  6. Hiding Sensitive Association Rule Using Clusters of Sensitive Association Rule

    Directory of Open Access Journals (Sweden)

    Sanjay keer

    2012-06-01

    Full Text Available The security of a large database that contains crucial information becomes a serious issue when data is shared over a network and must be protected against unauthorized access. Association rule hiding algorithms perform strongly and efficiently in protecting confidential and crucial data. The objective of the proposed association rule hiding algorithm for privacy-preserving data mining is to hide certain information so that it cannot be discovered through association rule mining algorithms. The main approach of association rule hiding algorithms is to hide some generated association rules by increasing or decreasing the support or the confidence of the rules, so that the association rule items, whether on the Left Hand Side (LHS) or Right Hand Side (RHS) of the generated rule, cannot be deduced through association rule mining algorithms. The concept of the Increase Support of Left Hand Side (ISL) algorithm is to decrease the confidence of a rule by increasing the support value of the LHS. It does not work on both sides of the rule; it works only by modifying the LHS. In this paper, we propose a heuristic algorithm named ISLRC (Increase Support of L.H.S. item of Rule Clusters), based on the ISL approach, to preserve privacy for sensitive association rules in a database. The proposed algorithm modifies fewer transactions and hides many rules at a time. The efficiency of the proposed algorithm is compared with that of the ISL algorithm.
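
    The ISL idea of lowering a rule's confidence by raising the support of its LHS can be illustrated with a small sketch. It is not the ISLRC algorithm and not a faithful ISL implementation: the function names, the transaction-selection policy (only transactions that lack the LHS and do not contain the full RHS are modified), and the data are assumptions made for illustration.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, lhs, rhs):
    """confidence(lhs -> rhs) = count(lhs and rhs) / count(lhs)."""
    lhs_count = support_count(transactions, lhs)
    return support_count(transactions, lhs | rhs) / lhs_count if lhs_count else 0.0


def hide_rule_by_increasing_lhs_support(transactions, lhs, rhs, min_confidence):
    """Return a modified copy in which confidence(lhs -> rhs) is pushed down.

    Stops as soon as the confidence falls below min_confidence; if too few
    suitable transactions exist, the rule may remain visible.
    """
    modified = [set(t) for t in transactions]
    for t in modified:
        if confidence(modified, lhs, rhs) < min_confidence:
            break
        # Assumed policy: only touch transactions that lack the LHS and do not
        # contain the full RHS, so the count of (LHS and RHS) stays unchanged.
        if not lhs <= t and not rhs <= t:
            t |= lhs
    return modified


# Hypothetical database; the sensitive rule is a -> b.
database = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"c"}, {"d"}]
lhs, rhs = {"a"}, {"b"}
hidden = hide_rule_by_increasing_lhs_support(database, lhs, rhs, min_confidence=0.5)
print(confidence(database, lhs, rhs), confidence(hidden, lhs, rhs))
```

    The support of the full rule is untouched; only the denominator of the confidence grows, which is why the approach modifies the LHS alone.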

  7. Customer Requirements Mapping Method Based on Association Rule Mining for Mass Customization

    Institute of Scientific and Technical Information of China (English)

    XIA Shi-sheng; WANG Li-ya

    2008-01-01

    Customer requirements analysis is the key step in product variety design for mass customization (MC). Quality function deployment (QFD) is a widely used management technique for understanding the voice of the customer (VOC); however, QFD depends heavily on subjective human judgment when extracting customer requirements and determining their importance weights. The QFD process and related problems are so complicated that it is not easily used. In this paper, based on a general data structure for product families, the generic bill of material (GBOM), association rule analysis is introduced to construct the classification mechanism between customer requirements and product architecture. The new method can map customer requirements to the respective items of the product family architecture, accomplish the mapping from the customer domain to the physical domain directly, reduce the interaction between customer and designer, improve product design quality, and thus satisfy customer needs to the greatest extent. Finally, an example of customer requirements mapping for an elevator cabin is used to illustrate the proposed method.

  8. Pattern Discovery Using Association Rules

    Directory of Open Access Journals (Sweden)

    Ms Kiruthika M

    2011-12-01

    Full Text Available The explosive growth of the Internet has given rise to many websites which maintain large amounts of user information. To utilize this information, identifying the usage patterns of users is very important. Web usage mining is one of the processes for finding these usage patterns and has many practical applications. Our paper discusses how association rules can be used to discover patterns in web usage mining. Our discussion starts with preprocessing of the given weblog, followed by clustering the records and finding association rules. These rules provide knowledge that helps to improve website design, advertising, web personalization, etc.

  9. Mining Online Store Client Assessment Classification Rules with Genetic Algorithms

    OpenAIRE

    Galinina, A; Paršutins, S

    2011-01-01

    The paper presents the results of the research into algorithms that are not meant to mine classification rules, yet they contain all the necessary functions which allow us to use them for mining classification rules such as Genetic algorithm (GA). The main task of the research is associated with the application of GA to classification rule mining. A classic GA was modified to match the chosen classification task and was compared with other popular classification algorithms – JRip, J48 and Nai...

  10. Using Association Rule Mining for Extracting Product Sales Patterns in Retail Store Transactions

    OpenAIRE

    Pramod Prasad; Dr. Latesh Malik

    2011-01-01

    Computers and software play an integral part in the working of businesses and organisations. An immense amount of data is generated with the use of software. These large datasets need to be analysed for useful information that would benefit organisations, businesses and individuals by supporting decision making and providing valuable knowledge. Data mining is an approach that aids in fulfilling this requirement. Data mining is the process of applying mathematical, statistical and machine lear...

  11. Association rule mining based study for identification of clinical parameters akin to occurrence of brain tumor

    OpenAIRE

    Dipankar SENGUPTA; Sood, Meemansa; Vijayvargia, Poorvika; Hota, Sunil; Naik, Pradeep K

    2013-01-01

    The healthcare sector is generating a large amount of information corresponding to diagnosis, disease identification and treatment of an individual. Mining knowledge and providing scientific decision-making for the diagnosis & treatment of disease from the clinical dataset is therefore increasingly becoming necessary. The aim of this study was to assess the applicability of knowledge discovery in a brain tumor data warehouse, applying data mining techniques for investigation of clinical parameters that...

  12. DETERMINING THE CORE PART OF SOFTWARE DEVELOPMENT CURRICULUM APPLYING ASSOCIATION RULE MINING ON SOFTWARE JOB ADS IN TURKEY

    Directory of Open Access Journals (Sweden)

    Ilkay Yelmen

    2016-01-01

    Full Text Available The software technology is advancing rapidly over the years. In order to adapt to this advancement, the employees on software development should renew themselves consistently. During this rapid change, it is vital to train the proper software developer with respect to the criteria desired by the industry. Therefore, the curriculum of the programs related to software development at the universities should be revised according to software industry requirements. In this study, the core part of Software Development Curriculum is determined by applying association rule mining on Software Job ads in Turkey. The courses in the core part are chosen with respect to IEEE/ACM computer science curriculum. As a future study, it is also important to gather the academic personnel and the software company professionals to determine the compulsory and elective courses so that newly graduated software dev

  13. Association Rule Discovery and Its Applications

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Data mining, i.e., mining knowledge from large amounts of data, is a demanding field since huge amounts of data have been collected in various applications. The collected data far exceed people's ability to analyze them. Thus, new and efficient methods are needed to discover knowledge from large databases. Association rule discovery is an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent itemsets and then forming conditional implication rules among them. In this paper, we describe and summarize recent work on association rule discovery, offer a new method for association rule mining, and point out that association rule discovery can be applied in spatial data mining. It is useful for discovering knowledge from remote sensing and geographical information systems.
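
    The two-step task described in this abstract (find frequent itemsets, then form implication rules among them) can be sketched with a textbook Apriori-style search. This is a generic illustration, not the new method the paper offers; the function names, thresholds, and baskets are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    while candidates:
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, keeping only those
        # whose k-subsets are all frequent (the Apriori pruning step).
        candidates = {
            a | b
            for a, b in combinations(level, 2)
            if len(a | b) == len(a) + 1
            and all(frozenset(s) in level for s in combinations(a | b, len(a)))
        }
    return frequent


def rules_from(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) for each strong rule."""
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                # Every subset of a frequent itemset is itself in `frequent`.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, confidence


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(baskets, min_support=0.5)
rules = list(rules_from(frequent, min_confidence=0.9))
for antecedent, consequent, conf in rules:
    print(set(antecedent), "->", set(consequent), conf)
```

    With these baskets the only rule that survives the 0.9 confidence threshold is butter -> bread, since every basket containing butter also contains bread.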

  14. Interestingness of association rules in data mining: Issues relevant to e-commerce

    Indian Academy of Sciences (India)

    Rajesh Natarajan; B Shekar

    2005-04-01

    The ubiquitous low-cost connectivity synonymous with the internet has changed the competitive business environment by dissolving traditional sources of competitive advantage based on size, location and the like. In this level playing field, firms are forced to compete on the basis of knowledge. Data mining tools and techniques provide e-commerce applications with novel and significant knowledge. This knowledge can be leveraged to gain competitive advantage. However, the automated nature of data mining algorithms may result in a glut of patterns – the sheer numbers of which contribute to incomprehensibility. Importance of automated methods that address this immensity problem, particularly with respect to practical application of data mining results, cannot be overstated. We first examine different approaches to address this problem citing their applicability to e-commerce whenever appropriate. We then provide a detailed survey of one important approach, namely interestingness measure, and discuss its relevance in e-commerce applications such as personalization in recommender systems. Study of current literature brings out important issues that reveal many promising avenues for future research. We conclude by reiterating the importance of post-processing methods in data mining for effective and efficient deployment of e-commerce solutions.

  15. Studying Co-evolution of Production and Test Code Using Association Rule Mining

    NARCIS (Netherlands)

    Lubsen, Z.; Zaidman, A.; Pinzger, M.

    2009-01-01

    Long version of the short paper accepted for publication in the proceedings of the 6th International Working Conference on Mining Software Repositories (MSR 2009). Unit tests are generally acknowledged as an important aid to produce high quality code, as they provide quick feedback to developers on

  16. Discovering Non-Redundant Association Rules using MinMax Approximation Rules

    Directory of Open Access Journals (Sweden)

    R. Vijaya Prakash

    2012-12-01

    Full Text Available Frequent pattern mining is an important area of data mining used to generate association rules. The quality of the extracted frequent patterns is a major concern, as they generate huge sets of rules, many of which are redundant. Mining non-redundant frequent patterns is therefore an important problem in association rule mining. In this paper we propose a method that eliminates redundant frequent patterns using the MinMax rule approach to generate quality association rules.

  17. AN ENHANCED FREQUENT PATTERN GROWTH BASED ON MAPREDUCE FOR MINING ASSOCIATION RULES

    Directory of Open Access Journals (Sweden)

    ARKAN A. G. AL-HAMODI

    2016-03-01

    Full Text Available In mining frequent itemsets, one of the most important algorithms is FP-growth. FP-growth compresses the information needed for mining frequent itemsets into an FP-tree and recursively constructs FP-trees to find all frequent itemsets. In this paper, we propose the EFP-growth (enhanced FP-growth) algorithm to achieve the quality of FP-growth. Our proposed method implements EFP-growth on the MapReduce framework using Hadoop. The new method achieves high performance compared with basic FP-growth. EFP-growth can work with large datasets to discover frequent patterns in a transaction database. With our method, the execution time under different minimum supports is decreased.
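
    How an FP-tree compresses transactions can be seen in a minimal construction sketch. It covers only the tree-building step of standard FP-growth, with no recursive mining and none of the paper's EFP-growth or MapReduce logic; the class, function names, and transactions are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Build an FP-tree; returns (root, counts of the frequent items)."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_count}
    root = FPNode()
    for t in transactions:
        # Keep frequent items, most frequent first (ties broken by name), so
        # transactions with common items share a prefix path.
        ordered = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-frequent[item], item),
        )
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item, node))
            node.count += 1
    return root, frequent


def count_nodes(node):
    """Number of item nodes below the given node."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical transactions: 8 item occurrences in total.
transactions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]]
root, frequent = build_fp_tree(transactions, min_count=2)
print(count_nodes(root), root.children["a"].count)
```

    The eight item occurrences are stored in five nodes because three transactions share the prefix starting with item a; this sharing is the compression that FP-growth exploits.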

  18. Studying Co-evolution of Production and Test Code Using Association Rule Mining

    OpenAIRE

    Lubsen, Z.; Zaidman, A.; Pinzger, M.

    2009-01-01

    Long version of the short paper accepted for publication in the proceedings of the 6th International Working Conference on Mining Software Repositories (MSR 2009). Unit tests are generally acknowledged as an important aid to produce high quality code, as they provide quick feedback to developers on the correctness of their code. In order to achieve high quality, well-maintained tests are needed. Ideally, tests co-evolve with the production code to test changes as soon as possible. In this pap...

  19. Identifying Combinatorial Biomarkers by Association Rule Mining in the CAMD Alzheimer's Database

    OpenAIRE

    Szalkai, Balazs; Grolmusz, Vince K.; Grolmusz, Vince I.; Diseases, Coalition Against Major

    2013-01-01

    Background: The concept of combinatorial biomarkers was conceived around 2010: it was noticed that simple biomarkers are often inadequate for recognizing and characterizing complex diseases. Methods: Here we present an algorithmic search method for complex biomarkers which may predict or indicate Alzheimer's disease (AD) and other kinds of dementia. We applied data mining techniques that are capable of uncovering implication-like logical schemes with detailed quality scoring. Our program SCARF i...

  20. Association Rule Mining and Classifier Approach for 48-Hour Rainfall Prediction Over Cuddalore Station of East Coast of India

    OpenAIRE

    S. Meganathan; T.R. Sivaramakrishnan

    2013-01-01

    The methodology of data mining techniques is presented for rain forecasting models for the Cuddalore (11°43′ N/79°49′ E) station of Tamilnadu on the East Coast of India. Data mining approaches such as classification and association mining were applied to generate results for rain prediction 48 hours before the actual occurrence of the rain. The objective of this study is to demonstrate what relationship models exist between various atmospheric variables and to interconnect these varia...

  1. Scalable Association Rule Mining with Predicates on Semantic Representations of Data

    Energy Technology Data Exchange (ETDEWEB)

    Tsay, Li-Shiang [ORNL; Sukumar, Sreenivas R [ORNL; Roberts, Larry W [ORNL

    2015-01-01

    Finding semantic associations from a vast amount of heterogeneous data is an important and useful task in various applications. We present a framework to extract semantic association patterns directly from a very large graph dataset without the extra step of converting graph data into transaction data.

  2. Association Rules in the Relational Calculus

    CERN Document Server

    Schulte, Oliver; Ester, Martin; Lu, Zhiyong

    2007-01-01

    One of the most utilized data mining tasks is the search for association rules. Association rules represent significant relationships between items in transactions. We extend the concept of association rule to represent a much broader class of associations, which we refer to as entity-relationship rules. Semantically, entity-relationship rules express associations between properties of related objects. Syntactically, these rules are based on a broad subclass of safe domain relational calculus queries. We propose a new definition of support and confidence for entity-relationship rules and for the frequency of entity-relationship queries. We prove that the definition of frequency satisfies standard probability axioms and the Apriori property.

  3. Application of Fuzzy Association Rule Mining for Analysing Students Academic Performance

    OpenAIRE

    Olufunke O. Oladipupo; Olanrewaju. J. Oyelade; Dada. O. Aborisade

    2012-01-01

    This study examines the relationship between students' preadmission academic profile and academic performance. A data sample of students in the Department of Computer Science in one of Nigeria's private universities was used. The preadmission academic profile considered includes 'O' level grades, University Matriculation Examination (UME) scores, and Post-UME scores. The academic performance is defined using students' Grade Point Average (GPA) at the end of a particular session. Fuzzy Association R...

  4. Association Rule Mining and Classifier Approach for 48-Hour Rainfall Prediction Over Cuddalore Station of East Coast of India

    Directory of Open Access Journals (Sweden)

    S. Meganathan

    2013-04-01

    Full Text Available The methodology of data mining techniques is presented for rain forecasting models for the Cuddalore (11°43′ N/79°49′ E) station of Tamilnadu on the East Coast of India. Data mining approaches such as classification and association mining were applied to generate results for rain prediction 48 hours before the actual occurrence of the rain. The objective of this study is to demonstrate what relationship models exist between various atmospheric variables and to interconnect these variables according to the patterns obtained from the data mining technique. Using this approach, rainfall estimates can be obtained to support decisions to launch cloud-seeding operations. There are three main parts in this study. First, the raw data was filtered using a discretization approach based on the best-fit ranges. Second, association mining was performed on it using the Predictive Apriori algorithm. Third, the data was validated using the K* classifier approach. Results show that the overall classification accuracy of the data mining technique is satisfactory.

  5. Investment risk rules mining in insurance business data with association rules method

    Institute of Scientific and Technical Information of China (English)

    田金兰; 张素琴; 黄刚

    2001-01-01

    Insurance companies need to find the rules governing applications and claims in insurance business data in order to make a profit. Mining for association rules is a simple and very useful data mining method. This paper introduces the definition of the association rule and its four attributes: confidence, support, expected confidence and lift. Then some association rules are mined with SGI (Silicon Graphics, Inc.) Mineset, a data mining tool. Some risk control rules are given that play important roles in the insurance company business. Association rules are also widely used in the fields of banking, telecommunications, commerce, etc.
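
    The four attributes named in this abstract can be computed as in the sketch below. The abstract does not give formulas, so the commonly used definitions are assumed: support is the fraction of records containing both sides, confidence is that count divided by the antecedent count, expected confidence is the fraction of records containing the consequent, and lift is confidence divided by expected confidence. The function name and policy records are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, expected confidence and lift of antecedent -> consequent."""
    n = len(transactions)
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    with_consequent = sum(1 for t in transactions if consequent <= t)
    support = both / n
    confidence = both / with_antecedent if with_antecedent else 0.0
    expected_confidence = with_consequent / n
    lift = confidence / expected_confidence if expected_confidence else 0.0
    return {
        "support": support,
        "confidence": confidence,
        "expected_confidence": expected_confidence,
        "lift": lift,
    }


# Hypothetical insurance records: 8 policies described by simple tags.
policies = [
    {"young_driver", "claim"},
    {"young_driver", "claim"},
    {"young_driver"},
    {"claim"},
    {"senior"},
    {"senior"},
    {"senior"},
    {"senior"},
]
measures = rule_measures(policies, {"young_driver"}, {"claim"})
print(measures)
```

    A lift above 1 means the consequent is more likely when the antecedent holds than it is overall; here the lift is 16/9, about 1.78.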

  6. AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES

    OpenAIRE

    Wael AlZoubi

    2015-01-01

    This paper proposes an improved approach to mine strong association rules from an association graph, called graph based association rule mining (GBAR) method, where the association for each frequent itemset is represented by a sub-graph, then all sub-graphs are merged to determine association rules with high confidence and eliminate weak rules, the proposed graph based technique is self-motivated since it builds the association graph in a successive manner. These rules achieve the scalability...

  7. Mining Clinical Data using Minimal Predictive Rules

    OpenAIRE

    Batal, Iyad; Hauskrecht, Milos

    2010-01-01

    Modern hospitals and health-care institutes collect huge amounts of clinical data. Those who deal with such data know that there is a widening gap between data collection and data comprehension. Thus, it is very important to develop data mining techniques capable of automatically extracting useful knowledge to support clinical decision-making in various diagnostic and patient-management tasks. In this paper, we develop a new framework for rule mining based on minimal predictive rules (MPR). O...

  8. A Reliability Evaluation System of Association Rules

    Science.gov (United States)

    Chen, Jiangping; Feng, Wanshu; Luo, Minghai

    2016-06-01

    In mining association rules, the evaluation of the rules is a highly important work because it directly affects the usability and applicability of the output results of mining. In this paper, the concept of reliability was imported into the association rule evaluation. The reliability of association rules was defined as the accordance degree that reflects the rules of the mining data set. Such degree contains three levels of measurement, namely, accuracy, completeness, and consistency of rules. To show its effectiveness, the "accuracy-completeness-consistency" reliability evaluation system was applied to two extremely different data sets, namely, a basket simulation data set and a multi-source lightning data fusion. Results show that the reliability evaluation system works well in both simulation data set and the actual problem. The three-dimensional reliability evaluation can effectively detect the useless rules to be screened out and add the missing rules thereby improving the reliability of mining results. Furthermore, the proposed reliability evaluation system is applicable to many research fields; using the system in the analysis can facilitate obtainment of more accurate, complete, and consistent association rules.

  9. A Quick Algorithm for Mining Exceptional Rules

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Exceptional rules are often ignored because of their small support. However, they have high confidence, so they are useful sometimes. A new algorithm for mining exceptional rules is presented, which creates a large itemset from a relatively small database and scans the whole database only one time to generate all exceptional rules. This algorithm is proved to be quick and effective through its application in a mushroom database.
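
    The notion of an exceptional rule (small support but high confidence) can be made concrete with a naive filter. This is only an illustration of the definition: it rescans the data for every item pair and is not the paper's single-scan algorithm. The function name, thresholds, and mushroom-like records are hypothetical.

```python
from itertools import permutations


def exceptional_rules(transactions, max_support, min_confidence):
    """Single-item rules a -> b with support <= max_support and confidence >= min_confidence."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    found = []
    for a, b in permutations(items, 2):
        with_a = sum(1 for t in transactions if a in t)
        with_both = sum(1 for t in transactions if a in t and b in t)
        if with_both == 0:
            continue  # the pair never co-occurs, so there is no rule
        support = with_both / n
        confidence = with_both / with_a
        if support <= max_support and confidence >= min_confidence:
            found.append((a, b, support, confidence))
    return found


# Hypothetical records: one rare colour that always implies "poisonous".
records = [{"white", "edible"} for _ in range(8)]
records += [{"red", "poisonous"}, {"white", "poisonous"}]
rare_but_reliable = exceptional_rules(records, max_support=0.2, min_confidence=0.9)
print(rare_but_reliable)
```

    A conventional minimum-support threshold of, say, 0.5 would discard the rule red -> poisonous, although it holds in every record where red appears.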

  10. Apriori Association Rule Algorithms using VMware Environment

    OpenAIRE

    R. Sumithra; Sujni Paul; D. Ponmary Pushpa Latha

    2014-01-01

    The aim of this study is to carry out research in distributed data mining using a cloud platform. Distributed data mining has become a vital component of big data analytics due to the development of network and distributed technology. The MapReduce Hadoop framework is a very familiar concept in big data analytics. The association rule algorithm is one of the popular data mining techniques, which finds the relationships between different transactions. A work has been executed using weighted apriori and h...

  11. Perspectives on Knowledge Discovery Algorithms Recently Introduced in Chemoinformatics: Rough Set Theory, Association Rule Mining, Emerging Patterns, and Formal Concept Analysis.

    Science.gov (United States)

    Gardiner, Eleanor J; Gillet, Valerie J

    2015-09-28

    Knowledge Discovery in Databases (KDD) refers to the use of methodologies from machine learning, pattern recognition, statistics, and other fields to extract knowledge from large collections of data, where the knowledge is not explicitly available as part of the database structure. In this paper, we describe four modern data mining techniques, Rough Set Theory (RST), Association Rule Mining (ARM), Emerging Pattern Mining (EP), and Formal Concept Analysis (FCA), and we have attempted to give an exhaustive list of their chemoinformatics applications. One of the main strengths of these methods is their descriptive ability. When used to derive rules, for example, in structure-activity relationships, the rules have clear physical meaning. This review has shown that there are close relationships between the methods. Often apparent differences lie in the way in which the problem under investigation has been formulated which can lead to the natural adoption of one or other method. For example, the idea of a structural alert, as a structure which is present in toxic and absent in nontoxic compounds, leads to the natural formulation of an Emerging Pattern search. Despite the similarities between the methods, each has its strengths. RST is useful for dealing with uncertain and noisy data. Its main chemoinformatics applications so far have been in feature extraction and feature reduction, the latter often as input to another data mining method, such as a Support Vector Machine (SVM). ARM has mostly been used for frequent subgraph mining. EP and FCA have both been used to mine both structural and nonstructural patterns for classification of both active and inactive molecules. Since their introduction in the 1980s and 1990s, RST, ARM, EP, and FCA have found wide-ranging applications, with many thousands of citations in Web of Science, but their adoption by the chemoinformatics community has been relatively slow. Advances, both in computer power and in algorithm development

  12. Discovery of Association Rules from University Admission System Data

    OpenAIRE

    Abdul Fattah Mashat; Mohammed M. Fouad; Yu, Philip S.; Tarek F. Gharib

    2013-01-01

    Association rules discovery is one of the vital data mining techniques. Currently there is an increasing interest in data mining and educational systems, making educational data mining (EDM) a new and growing research community. In this paper, we present a model for association rules discovery from King Abdulaziz University (KAU) admission system data. The main objective is to extract the rules and relations between admission system attributes for better analysis. The model utilizes an apriori...

  13. Apriori Association Rule Algorithms using VMware Environment

    Directory of Open Access Journals (Sweden)

    R. Sumithra

    2014-07-01

    The aim of this study is to carry out research in distributed data mining using a cloud platform. Distributed data mining has become a vital component of big data analytics due to the development of network and distributed technology. The MapReduce Hadoop framework is a very familiar concept in big data analytics. The association rule algorithm is one of the popular data mining techniques, which finds relationships between different transactions. Work has been carried out using the weighted apriori and hash-T apriori algorithms for association rule mining on a MapReduce Hadoop framework with a retail data set of transactions. This study describes the above concepts, explains the experiment carried out with the retail data set in a VMware environment, and compares the performance of the weighted apriori and hash-T apriori algorithms in terms of memory and time.

  14. Post-processing of association rules.

    OpenAIRE

    Baesens, Bart; Viaene, Stijn; Vanthienen, Jan

    2000-01-01

    In this paper, we situate and motivate the need for a post-processing phase to the association rule mining algorithm when plugged into the knowledge discovery in databases process. Major research effort has already been devoted to optimising the initially proposed mining algorithms. When it comes to effectively extrapolating the most interesting knowledge nuggets from the standard output of these algorithms, one is faced with an extreme challenge, since it is not uncommon to be confronted wit...

  15. Pair Triplet Association Rule Generation in Streams

    Directory of Open Access Journals (Sweden)

    Manisha Thool

    2013-08-01

    Many applications involve the generation and analysis of a new kind of data, called stream data, in which data flows in and out of an observation platform or window dynamically. Such data streams have unique features: huge or possibly infinite volume, dynamic change, flow in or out in a fixed order, and allowance for only one or a small number of scans. An important problem in data stream mining is finding frequent items in the stream. This problem has applications across several domains, such as financial systems, web traffic monitoring, internet advertising, retail, and e-business. It raises new issues that need to be considered when developing association rule mining techniques for stream data. The Space-Saving algorithm reports both frequent and top-k elements with tight guarantees on errors. We also develop the notion of association rules in streams of elements. The Streaming-Rules algorithm is integrated with the Space-Saving algorithm to report 1-1 association rules with tight guarantees on errors, using minimal space and limited processing per element. We use the Apriori algorithm to generate association rules for static datasets and implement the Streaming-Rules algorithm for pair and triplet association rules. We compare the top-k rules of the static datasets with the output for the stream datasets and find the percentage of error.
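
    The Space-Saving idea mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the general counter-replacement scheme, not the authors' implementation and not the Streaming-Rules algorithm; the function names, the `capacity` parameter, and the toy stream are assumptions. A production version would use a structure that finds the minimum counter in constant time instead of scanning.

```python
def space_saving(stream, capacity):
    """Approximate item counts using at most `capacity` counters.

    Returns (counts, errors): counts[item] is an upper bound on the true
    frequency, and counts[item] - errors[item] is a lower bound.
    """
    counts = {}  # monitored item -> estimated count
    errors = {}  # monitored item -> maximum possible overestimation
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
            errors[item] = 0
        else:
            # Evict the item with the smallest counter; the newcomer
            # inherits that count as its possible error.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors


def top_k(counts, k):
    """The k monitored items with the largest estimated counts."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy stream (invented): 16 elements, 'a' occurs 8 times.
stream = list("abacabadabacabae")
counts, errors = space_saving(stream, capacity=3)
```

    Two properties are worth checking on any run: the counters always sum to the stream length, and any item whose true frequency exceeds the stream length divided by `capacity` is guaranteed to be monitored.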

  16. Flexible Rule Mining for Difference Rules and Exception Rules from Incomplete Database

    Science.gov (United States)

    Shimada, Kaoru; Hirasawa, Kotaro

    Two flexible rule mining methods for incomplete databases are proposed using Genetic Network Programming (GNP). GNP is an evolutionary optimization technique that uses a directed graph structure. One method extracts rules showing the different characteristics of different classes in a database; it can obtain rules such as "'if P then Q' is interesting only in the focusing class". The other mines interesting rules of the form: even if itemsets X and Y each have a weak or no statistical relation to class item C, the join of X and Y has a strong relation to class item C. An incomplete database includes missing data in some tuples. Generally, it is not easy for Apriori-like methods to extract difference rules and exception rules from an incomplete database. We have evaluated the performance of rule extraction from incomplete data in the environmental and medical fields.

  17. Association Rule Mining Strategy Based on Three-Sectional Coding Immune Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    王晓光; 张永健

    2014-01-01

    Recent intelligent algorithms for association rule mining suffer from low mining accuracy, a tendency to fall into local convergence, and long running times. To solve these problems, a TIGA strategy is proposed for mining association rules over continuous attributes. First, a three-sectional encoding scheme is used to reduce the influence of the choice of segmentation points on mining. Second, an immune algorithm is used to mine the association rules, with a selection scheme based on vector distance that increases the diversity of the population and the accuracy of the mined rules. Finally, adaptive crossover and mutation factors are used to reduce the interference of manually set parameters with the mining results. The experimental results show that, compared with the latest mining algorithms, the proposed algorithm offers high precision, global convergence, and short mining time.

  18. Interactive Postmining of Association Rules by Validating Ontologies

    OpenAIRE

    PILLALAMARRI ANUSHA; GINJALA SRIKANTH REDDY

    2012-01-01

    The Data Mining process enables end users to analyze, understand, and use the extracted knowledge in an intelligent system or to support decision-making processes. Association refers to the data mining task of uncovering relationships among the data. In Data Mining, the usefulness of association rules is strongly limited by the huge amount of delivered rules. There are a number of ways to overcome this drawback that depend on statistical information. But none of them guarantee t...

  19. Data mining the association rules in outpatient service prescriptions

    Institute of Scientific and Technical Information of China (English)

    傅翔; 杨樟卫; 陈盛新; 陈长虹; 何宇涛; 黄晓钟

    2011-01-01

    Objective: To analyze the outpatient prescription data of a hospital, mine the association rules among the drugs in the prescriptions, reveal prescription patterns, and identify problems. Methods: The data mining software PASW Modeler 13 was used to build an Apriori association analysis model. Results: In the 47,132 sampled prescriptions, drugs for the prevention and treatment of cardiovascular and other chronic diseases were used most frequently. Expectorants, cough suppressants, and heat-clearing and detoxifying ("Qing Re Jie Du") Chinese patent medicines were clearly associated with cephalosporin antibacterials. Conclusion: Data mining techniques can process and analyze prescription data quickly and reflect prescription patterns, and they are suitable for analyzing the large volumes of data in current drug utilization research.

  20. Secure Medical Diagnosis Using Rule Based Mining

    Science.gov (United States)

    Saleem Durai, M. A.; Sriman Narayana Iyengar, N. Ch.

    Security is the governing dynamics of all walks of life. Here we propose a secure medical diagnosis system. Certain rules are specified implicitly by the designer of the expert system, symptoms of the diseases are then obtained from the users, and, using predefined confidence and support values, we extract a threshold value that is used, through rule mining, to conclude on a particular disease and its stage. The "THINK" CAPTCHA mechanism is used to distinguish humans from robots, thereby eliminating the robots and preventing them from creating fake accounts and spam. A novel image encryption mechanism is designed using a genetic algorithm to encrypt the medical images so that image data are stored and sent securely.

  1. Research on the Intrusion Detection Algorithm Based on Association Rule Mining

    Institute of Scientific and Technical Information of China (English)

    吴斌; 陆培军

    2012-01-01

    This paper analyzes the features of wireless network intrusion detection and proposes a new association rule mining algorithm based on time windows. Theoretical analysis and experimental results show that the time-window-based association rule mining algorithm is superior in efficiency and other respects, and that it achieves good results in intrusion detection.

  2. An Algorithm of Multidimensional Association Rules Mining Based on Data Vertically Distributed in Two Parts

    Institute of Scientific and Technical Information of China (English)

    李海磊; 王晗; 孔令富; 高慧星

    2014-01-01

    Joint association rule mining over data vertically distributed across different sites is an important research direction. However, existing algorithms yield only global single-dimensional association rules; they cannot process multidimensional data sets to obtain global multidimensional association rules. To solve this problem, we propose TDDM (Two-Part Vertically Distributed Data Mining), a multidimensional association rule mining algorithm for data vertically distributed across two parties. Combining data cube technology, the algorithm mines directly on the data vertically distributed across the two parties and obtains multidimensional association rules. Theoretical analysis and experimental results show that TDDM can effectively mine multidimensional association rules when data are vertically distributed across two parties.

  3. Comparison of New Multilevel Association Rule Algorithm with MAFIA

    OpenAIRE

    Arpna Shrivastava; Jain, R. C.; Ajay Kumar Shrivastava

    2014-01-01

    Multilevel association rules provide more precise and specific information. The Apriori algorithm is an established algorithm for finding association rules. A fast Apriori implementation is modified to develop a new algorithm for finding frequent itemsets and mining multilevel association rules. MAFIA is another established algorithm for finding frequent itemsets. In this paper, the performance of the new algorithm is analyzed and compared with that of the MAFIA algorithm.

  4. Comparison of New Multilevel Association Rule Algorithm with MAFIA

    Directory of Open Access Journals (Sweden)

    Arpna Shrivastava

    2014-10-01

    Multilevel association rules provide more precise and specific information. The Apriori algorithm is an established algorithm for finding association rules. A fast Apriori implementation is modified to develop a new algorithm for finding frequent itemsets and mining multilevel association rules. MAFIA is another established algorithm for finding frequent itemsets. In this paper, the performance of the new algorithm is analyzed and compared with that of the MAFIA algorithm.
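
    For readers unfamiliar with the baseline, the level-wise frequent-itemset search of plain Apriori can be sketched as follows. This is neither the modified multilevel algorithm from the paper above nor MAFIA; the function name, the `min_support` parameter, and the toy baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that appears anywhere.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate with one pass over the transactions.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Toy data (invented): five market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=0.6)
```

    The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted only if all of its smaller subsets were already found frequent, so in the toy data no candidate containing both bread and beer is ever counted beyond the pair itself.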

  5. New game - new rules: mining in the democratic South Africa

    Energy Technology Data Exchange (ETDEWEB)

    Motlatsi, J. [National Union of Mineworkers (South Africa)

    1995-12-31

    Discusses the eight areas identified by the South African Union of Mineworkers as requiring new rules to improve safety and conditions in the South African mining industry. The areas are: improved health and safety; the elimination of racism; fair wages; decent living conditions; proper training; care for workers and areas affected by the downscaling of mining; development of an economically viable mining sector; and a mining sector run in a humane and participatory manner.

  6. An Interactive System using Association Rule Discovery for Dyeing Processing System

    Directory of Open Access Journals (Sweden)

    Rama Sree .R.J

    2011-09-01

    This paper uses prior domain knowledge to guide the mining of association rules in the dyeing business process environment. This approach is used to overcome the drawbacks of data mining by rule induction, such as loss of information, discovery of too many obvious patterns, and an overwhelming number of mined association rules. An interactive rule induction algorithm is introduced to mine rules at micro levels. The mined rules describe the impact of different colour shades, the originator of the treatment, and treatment details on dyeing process quality and production growth. A system based on this algorithm was built, tested, and verified on a real data set from the Emerald Dyeing unit, which is the leading dyeing business in Andhra Pradesh, India. The paper's main contribution is a simple interactive system, called a process model, derived using an association rule mining algorithm for the dyeing processing system.

  7. Towards an incremental maintenance of cyclic association rules

    CERN Document Server

    Ahmed, Eya ben

    2010-01-01

    Recently, cyclic association rules have been introduced in order to discover rules from items characterized by their regular variation over time. In real-life situations, temporal databases are often appended or updated. Rescanning the whole database every time is highly expensive, whereas existing incremental mining techniques can solve such a problem efficiently. In this paper, we propose an incremental algorithm for the maintenance of cyclic association rules. Experiments carried out on our proposal highlight its efficiency and performance.

  8. Association Rule Generation Using Apriori Mend Algorithm for Students Placement

    OpenAIRE

    Magdalene Delighta Angeline; Samuel Peter James

    2012-01-01

    Association rules are used to find interesting rules in large collections of data; each rule expresses an association between items or sets of items. The technique is well suited to typical data mining problems. To show the effective relation of data, student placement was chosen as the application, and the experiments carried out yield the best rules with 92.86% confidence when compared with the previous Apriori approach. In this paper Apriori Mend algorithm was discuss...

  9. Bayesian Belief Network for Generating Fuzzy Association Rules

    OpenAIRE

    Rolly Intan; Oviliani Yenty Yuliana; Dwi Kristanto

    2010-01-01

    Bayesian Belief Network (BBN), one of the data mining classification methods, is used in this research for mining and analyzing medical track record from a relational data table. In this paper, a mutual information concept is extended using fuzzy labels for determining the relation between two fuzzy nodes. The highest fuzzy information gain is used for mining fuzzy association rules in order to extend a BBN. Meaningful fuzzy labels can be defined for each domain data. For example, fuzzy label...

  10. Revealing Significant Relations between Chemical/Biological Features and Activity: Associative Classification Mining for Drug Discovery

    Science.gov (United States)

    Yu, Pulan

    2012-01-01

    Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…

  11. Post-Processing of Discovered Association Rules Using Ontologies

    OpenAIRE

    Marinica, Claudia; Guillet, Fabrice; Briand, Henri

    2009-01-01

    In Data Mining, the usefulness of association rules is strongly limited by the huge amount of delivered rules. In this paper we propose a new approach to prune and filter discovered rules. Using Domain Ontologies, we strengthen the integration of user knowledge in the post-processing task. Furthermore, an interactive and iterative framework is designed to assist the user along the analyzing task. On the one hand, we represent user domain knowledge using a Domain Ontology over database. On the...

  12. Observational Calculi and Association Rules

    CERN Document Server

    Rauch, Jan

    2013-01-01

    Observational calculi were introduced in the 1960’s as a tool of logic of discovery. Formulas of observational calculi correspond to assertions on analysed data. Truthfulness of suitable assertions can lead to acceptance of new scientific hypotheses. The general goal was to automate the process of discovery of scientific knowledge using mathematical logic and statistics. The GUHA method for producing true formulas of observational calculi relevant to the given problem of scientific discovery was developed. Theoretically interesting and practically important results on observational calculi were achieved. Special attention was paid to formulas - couples of Boolean attributes derived from columns of the analysed data matrix. Association rules introduced in the 1990’s can be seen as a special case of such formulas. New results on logical calculi and association rules were achieved. They can be seen as a logic of association rules. This can contribute to solving contemporary challenging problems of data minin...

  13. The Application of Clustering and Association Rule Mining to Fraud Information Identification

    Institute of Scientific and Technical Information of China (English)

    幸莉仙; 黄慧连

    2012-01-01

    With the rapid expansion of electronic data, traditional audit approaches can no longer cope with massive volumes of business data. This paper therefore introduces clustering and association rule mining into the audit field. Building on a study of clustering and association rule mining and their associated algorithms, K-Means and Apriori, it proposes an audit model based on clustering and association rules. Taking the audit of a city's urban medical insurance as an example, it first uses cluster analysis to filter the data and then uses association rules to mine potential relationships in the massive data, so as to determine audit priorities and provide audit clues. Through this case analysis, the article provides a reference for applying data mining to fraud information identification.

  14. SAS: Implementation of scaled association rules on spatial multidimensional quantitative dataset

    Directory of Open Access Journals (Sweden)

    M. N. Doja

    2012-09-01

    Mining spatial association rules is one of the most important branches of Spatial Data Mining (SDM). Because of the complexity of spatial data, a traditional way of extracting spatial association rules is to transform the spatial database into a general transaction database. The Apriori algorithm is currently one of the most commonly used methods for mining association rules, but its performance on large databases is inefficient. This paper proposes a new algorithm that extracts maximum frequent itemsets from a spatial multidimensional quantitative dataset. Algorithms for mining spatial association rules are similar to ordinary association rule mining except that they take spatial data into account; the predicate generation and rule generation processes are based on Apriori. The proposed method, SAS (Scaled Apriori on Spatial multidimensional quantitative dataset), reduces the number of itemsets generated and also improves the execution time of the algorithm.

  15. Sanitizing sensitive association rules using fuzzy correlation scheme

    International Nuclear Information System (INIS)

    Data mining is used to extract useful information hidden in the data. Sometimes this extraction of information leads to revealing sensitive information. Privacy preservation in Data Mining is a process of sanitizing sensitive information. This research focuses on sanitizing sensitive rules discovered in quantitative data. The proposed scheme, Privacy Preserving in Fuzzy Association Rules (PPFAR) is based on fuzzy correlation analysis. In this work, fuzzy set concept is integrated with fuzzy correlation analysis and Apriori algorithm to mark interesting fuzzy association rules. The identified rules are called sensitive. For sanitization, we use modification technique where we substitute maximum value of fuzzy items with zero, which occurs most frequently. Experiments demonstrate that PPFAR method hides sensitive rules with minimum modifications. The technique also maintains the modified data's quality. The PPFAR scheme has applications in various domains e.g. temperature control, medical analysis, travel time prediction, genetic behavior prediction etc. We have validated the results on medical dataset. (author)

  16. The diagnostic rules of peripheral lung cancer preliminary study based on data mining technique

    Institute of Scientific and Technical Information of China (English)

    Yongqian Qiang; Youmin Guo; Xue Li; Qiuping Wang; Hao Chen; Duwu Cui

    2007-01-01

    Objective: To discuss the clinical and imaging diagnostic rules of peripheral lung cancer obtained by data mining techniques, to explore new ideas in the diagnosis of peripheral lung cancer, and to obtain early-stage technology and knowledge support for computer-aided detection (CAD). Methods: 58 cases of peripheral lung cancer confirmed by clinical pathology were collected. The data were imported into the database after the attributes of the clinical and CT findings had been identified and standardized. The data were studied comparatively using Association Rules (AR) from the knowledge discovery process and the Rough Set (RS) reduction algorithm and Genetic Algorithm (GA) of the generic data analysis tool ROSETTA. Results: The genetic classification algorithm of ROSETTA generates about 5,000 diagnosis rules. The RS reduction algorithm (Johnson's Algorithm) generates 51 diagnosis rules, and the AR algorithm generates 123 diagnosis rules. All three data mining methods treat gender, age, cough, location, lobulation sign, shape, and ground-glass density as the main basis for the diagnosis of peripheral lung cancer. Conclusion: The diagnosis rules for peripheral lung cancer obtained with the three data mining techniques agree with clinical diagnostic rules, and they can also be used to build the knowledge base of an expert system. This study demonstrates the potential value of data mining technology in clinical imaging diagnosis and differential diagnosis.

  17. MAROR: Multi-Level Abstraction of Association Rule Using Ontology and Rule Schema

    Directory of Open Access Journals (Sweden)

    Salim Khiat

    2014-11-01

    Many large organizations have multiple databases distributed over different branches, and the number of such organizations is increasing over time. It is therefore necessary to study data mining on multiple databases. Most multi-database mining (MDBM) algorithms for association rules represent input patterns at a single level of abstraction. However, in many applications of association rules, e.g., industrial discovery, users often need to explore a data set at multiple levels of abstraction and from different points of view. Each point of view corresponds to a set of beliefs (and representational commitments) regarding the domain of interest. Using domain ontologies, we strengthen the integration of user knowledge in the mining and post-processing tasks. Furthermore, an interactive and iterative framework is designed to assist the user throughout the analysis task at different levels. This paper formalizes the problem of association rules using ontologies in multi-database mining, describes an ontology-driven association rules algorithm to discover rules at multiple levels of abstraction, and presents preliminary results from the petroleum field to demonstrate the feasibility and applicability of the proposed approach.

  18. Medical images data mining using classification algorithm based on association rule

    Institute of Scientific and Technical Information of China (English)

    邓薇薇; 卢延鑫

    2012-01-01

    Objective: To assist clinicians in the diagnosis and treatment of brain disease, a classifier for medical images containing tumors was constructed using association-rule-based data mining techniques, so that brain tumor images of unknown type can be identified and classified automatically. Methods: After a pre-processing phase, relevant features were extracted from the medical images and discretized, then stored in a transaction database as the input for association rule mining; the medical image classifier was then constructed with an improved Apriori algorithm. Results: The medical image classifier was constructed. Medical images of known type were used to train the classifier in order to mine the association rules that satisfy the constraint conditions. The discovered rules were then used to classify the brain tumor in medical images of unknown type as benign or malignant. Conclusion: A classification algorithm based on association rules can be used effectively to mine image features and to construct an image classifier that identifies benign or malignant tumors.

  19. Price Adjustment by Mining Negative Association Rules

    Institute of Scientific and Technical Information of China (English)

    黄发良; 郑小建; 张师超

    2006-01-01

    Setting good product prices is a crucial problem in a fiercely competitive market. A novel pricing method based on negative association rules identified from past data is proposed; it is easy to operate and easy to extend for end users. In this approach, an optimal price can be generated with either of two strategies: a human-assisted pricing strategy or an automatic pricing strategy. In addition, an efficient algorithm for generating short negative association rules is devised. The results show that the approach is promising and efficient.

  20. Mining Rules from Electrical Load Time Series Data Set

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The mining of the rules from the electrical load time series data which are collected from the EMS (Energy Management System) is discussed. The data from the EMS are too huge and sophisticated to be understood and used by the power system engineer, while useful information is hidden in the electrical load data. The authors discuss the use of fuzzy linguistic summary as data mining method to induce the rules from the electrical load time series. The data preprocessing techniques are also discussed in the paper.

  1. Arabic Text Mining Using Rule Based Classification

    OpenAIRE

    Fadi Thabtah; Omar Gharaibeh; Rashid Al-Zubaidy

    2012-01-01

    A well-known classification problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. Text classification has recently attracted many researchers because of the massive amounts of online documents and text archives that hold essential information for decision-making. In this field, most of such researches focus on classifying English documents while there are limited studi...

  2. Research and Application of Association Rule Mining in Software Enterprise's Customer Relationship Management

    Institute of Scientific and Technical Information of China (English)

    杨盛苑; 张向娟

    2012-01-01

    Association rule mining is one of the most active methods in data mining research. This paper uses the Apriori algorithm to mine the relationships between the customers and products of a software enterprise in order to find potential relationships among products. These relationships can guide decision makers in applying different marketing strategies to different customers, and so achieve high-quality customer relationship management.

  3. Application and Discussion of Data Mining Model Based on Microsoft Association Rules Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘城霞

    2013-01-01

    This paper studies the Microsoft association rules algorithm, one of the data mining algorithms, and its application in the financial field. The role of data mining is to find useful, latent information in massive data. By filtering and mining customer account and transaction data, the model underpins a business data mining application system that gives bank managers better decision support and recommendations and offers advice to ordinary customers. The client side of the system is developed with Visual Studio .NET 2008; ADOMD.NET objects are used to connect to the mining model and set up prediction targets, and Web controls display the model's results. After entering some personal attributes and basic service requirements, customers can view the payment situation, loan amount, and suitable credit card type they are interested in, and the bank can offer corresponding value-added services based on each customer's payment characteristics. The mining process of the association rule model is analyzed in detail throughout the construction of the example system, which promotes the practical application of data mining.

  4. Recent Trends and Research Issues in Video Association Mining

    Directory of Open Access Journals (Sweden)

    Vijayakumar.V

    2011-12-01

    Full Text Available With the ever-growing digital libraries and video databases, it is increasingly important to understand and mine the knowledge from video databases automatically. Discovering association rules between items in a large video database plays a considerable role in video data mining research. Based on the research and development of the past years, the application of association rule mining is growing in different domains such as surveillance, meetings, broadcast news, sports, archives, movies, medical data, as well as personal and online media collections. The purpose of this paper is to provide a general framework for mining association rules from video databases. The article also presents the research issues in video association mining, followed by the recent trends.

  5. Interactive Postmining of Association Rules by Validating Ontologies

    Directory of Open Access Journals (Sweden)

    PILLALAMARRI ANUSHA

    2012-06-01

    Full Text Available The Data Mining process enables end users to analyze, understand and use the extracted knowledge in an intelligent system or to support decision-making processes. Association refers to the data mining task of uncovering relationships among the data. In Data Mining, the usefulness of association rules is strongly limited by the huge number of delivered rules. There are a number of ways to overcome this drawback that depend on statistical information, but none of them guarantees that the extracted rules are interesting to the user. We therefore introduce a new approach in which user knowledge is taken into consideration to extract the rules through an interactive system. We propose ontologies in order to improve the integration of user knowledge in the postprocessing task. The user can even edit or validate the ontologies. We propose the Rule Schema formalism, extending the specification language for user expectations. The extracted rules are pruned and filtered, so that a voluminous set of rules is reduced to several dozen or fewer.

  6. Mining association rules based on cultured immune clone algorithm

    Institute of Scientific and Technical Information of China (English)

    杨光军

    2013-01-01

    For the association rule mining problem, a method of mining association rules based on a cultured immune clone algorithm is proposed. The method embeds the immune clone algorithm in the cultural algorithm framework and uses a two-layer evolutionary mechanism. It uses the intelligent search ability of the immune clone algorithm and the commonly accepted knowledge formed in the belief space of the cultural algorithm to guide rule mining. The situational knowledge and history knowledge of the cultural algorithm are redefined, and a new mutation operator is designed that adaptively adjusts the mutation scale to improve the global search ability of the immune clone algorithm. Experiments show that the new algorithm is superior to the immune clone algorithm in running speed and in the accuracy of the resulting rules.

  7. An Efficient Algorithm to Automated Discovery of Interesting Positive and Negative Association Rules

    Directory of Open Access Journals (Sweden)

    Ahmed Abdul-WahabAl-Opahi

    2015-06-01

    Full Text Available Association rule mining is a very efficient technique for finding strong relations between correlated data. The correlation of data makes the extraction process meaningful. For discovering frequent items and mining positive rules, a variety of algorithms are used, such as the Apriori algorithm and tree-based algorithms. But these algorithms do not consider negated occurrences of attributes, and the rules they produce do not cover infrequent items. The discovery of infrequent itemsets is far more difficult than that of their counterparts, frequent itemsets. The problems include the discovery of infrequent itemsets, the generation of interesting negative association rules, and the huge number of such rules compared with positive association rules. The discovery of interesting association rules is an important and active area within data mining research. In this paper, an efficient algorithm is proposed for discovering interesting positive and negative association rules from frequent and infrequent items. The experimental results show the usefulness and effectiveness of the proposed algorithm.

  8. Re-mining positive and negative association mining results

    OpenAIRE

    Demiriz, Ayhan; Ertek, Gürdal; Ertek, Gurdal; Atan, Tankut; Kula, Ufuk

    2010-01-01

    Positive and negative association mining are well-known and extensively studied data mining techniques to analyze market basket data. Efficient algorithms exist to find both types of association, separately or simultaneously. Association mining is performed by operating on the transaction data. Despite being an integral part of the transaction data, the pricing and time information has not been incorporated into market basket analysis so far, and additional attributes have been handled using ...

  9. A Method for Hiding Association rules with Minimum Changes in Database

    Directory of Open Access Journals (Sweden)

    Zahra Sheykhinezhad

    Full Text Available Privacy-preserving data mining is a way to keep using data mining without disclosing private information. To prevent disclosure of sensitive information by data mining techniques, it is necessary to make changes to the database. Association rules ...

  10. Remote Sensing Classification based on Improved Ant Colony Rules Mining Algorithm

    Directory of Open Access Journals (Sweden)

    Shuying Liu

    2014-09-01

    Full Text Available Data mining can uncover previously undetected relationships among data items using automated data analysis techniques. In data mining, association rule mining is a prevalent and well-researched method for discovering useful relations between variables in large databases. This paper investigates the principle of traditional rule mining, which produces many non-essential candidate sets when it reads data into candidate items. Particularly when it deals with massive data, if the minimum support and minimum confidence are relatively small, a combinatorial explosion of frequent itemsets will occur, and the computing power and storage space required are likely to exceed the limits of the machine. A new ant colony algorithm based on the conventional Ant-Miner algorithm is proposed and used in rule mining. The formula measuring the effectiveness of the rules is improved, and the pheromone concentration update strategy is also revised. The experimental results show that the proposed algorithm has a lower execution time and better accuracy than the traditional algorithm.

  11. Role of Interestingness Measures in CAR Rule Ordering for Associative Classifier: An Empirical Approach

    CERN Document Server

    Kannan, S

    2010-01-01

    Associative Classifier is a novel technique which is the integration of Association Rule Mining and Classification. The difficult task in building Associative Classifier model is the selection of relevant rules from a large number of class association rules (CARs). A very popular method of ordering rules for selection is based on confidence, support and antecedent size (CSA). Other methods are based on hybrid orderings in which CSA method is combined with other measures. In the present work, we study the effect of using different interestingness measures of Association rules in CAR rule ordering and selection for associative classifier.
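    The CSA ordering mentioned in this abstract can be sketched as a plain sort. This is only an illustration, not the authors' code: the `Rule` type and its fields are hypothetical, and the tie-break direction (shorter antecedents first) is an assumption, since the abstract does not state it.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical container for one class association rule (CAR)."""
    antecedent: frozenset  # items on the left-hand side
    label: str             # predicted class
    support: float
    confidence: float


def csa_order(rules):
    """Order rules by confidence (high first), then support (high first),
    then antecedent size (small first; this direction is an assumption)."""
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, len(r.antecedent)),
    )


example_rules = [
    Rule(frozenset({"a", "b"}), "yes", 0.20, 0.90),
    Rule(frozenset({"a"}), "yes", 0.20, 0.90),
    Rule(frozenset({"a", "b", "c"}), "no", 0.10, 0.95),
    Rule(frozenset({"b", "c"}), "no", 0.30, 0.90),
]
ranked = csa_order(example_rules)
```

    A hybrid ordering of the kind the abstract describes could be obtained by inserting another measure into the sort key, but which measures the study actually combines is not given here.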

  12. Extended Realization of Associative Property for Data Mining using Apriori Optimization Technique for Frequency Pattern Generation

    OpenAIRE

    Arpita Lodha; Charu Kavadia

    2013-01-01

    Mining association rules in transactional or relational databases has recently attracted a lot of attention in database communities. Associative Rule Mining as defined in [13] is a popular and well-researched method for discovering interesting relationships between various items involved in large databases. Introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems. Our objective is to find the Frequency ...

  13. Using the interestingness measure lift to generate association rules

    Directory of Open Access Journals (Sweden)

    Nada Hussein

    2015-04-01

    Full Text Available In this digital age, organizations have to deal with huge amounts of data, sometimes called Big Data. In recent years, the volume of data has increased substantially. Consequently, finding efficient and automated techniques for discovering useful patterns and relationships in the data becomes very important. In data mining, patterns and relationships can be represented in the form of association rules. Current techniques for discovering association rules rely on measures such as support for finding frequent patterns and confidence for finding association rules. A shortcoming of confidence is that it does not capture the correlation that exists between the left-hand side (LHS) and the right-hand side (RHS) of an association rule. On the other hand, the interestingness measure lift captures such correlation in the sense that it tells us whether the LHS influences the RHS positively or negatively. Therefore, using lift instead of confidence as a criterion for discovering association rules can be more effective. It also gives the user more choices in determining the kind of association rules to be discovered. This in turn helps to narrow down the search space and consequently improves performance. In this paper, we describe a new approach for discovering association rules that is based on lift rather than confidence.
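    The contrast between confidence and lift can be made concrete with a small sketch. It uses the standard textbook definitions of support, confidence and lift rather than anything specific to this paper, and the toy transactions are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs)."""
    both = frozenset(lhs) | frozenset(rhs)
    return support(transactions, both) / support(transactions, lhs)


def lift(transactions, lhs, rhs):
    """confidence(lhs -> rhs) / support(rhs).
    Above 1: positive correlation; below 1: negative correlation."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Invented toy data: tea and coffee are both popular items.
toy = [
    {"tea", "coffee"},
    {"tea", "coffee"},
    {"tea", "coffee"},
    {"tea"},
    {"coffee"},
]
conf_tea_coffee = confidence(toy, {"tea"}, {"coffee"})
lift_tea_coffee = lift(toy, {"tea"}, {"coffee"})
```

    On this toy data the rule tea -> coffee has a confidence of 0.75, which looks strong, while its lift is below 1 because coffee already appears in 80% of all transactions. The functions assume the left-hand and right-hand sides each occur at least once; otherwise they divide by zero.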

  14. A New Method for Generating All Positive and Negative Association Rules

    Directory of Open Access Journals (Sweden)

    Rupesh Dewang,

    2011-04-01

    Full Text Available Association rules play a very important role in current data mining. So far, however, only positive rules have been generated; negative rules are also useful in today's data mining tasks. In this paper we propose “A new method for generating all positive and negative Association Rules” (NRGA). NRGA generates all association rules that remain hidden when the Apriori algorithm is applied. To represent negative rules we give these rules new names: CNR, ANR, and ACNR. In this paper we also modify the correlation coefficient (CRC) equation, and the generated results are very promising. First we apply the Apriori algorithm for frequent itemset generation, which also generates the positive rules; then we apply the NRGA algorithm to the frequent itemsets to generate all negative rules, and optimize the generated rules using a Genetic Algorithm.

  15. ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURISTIC ALGORITHMS

    Directory of Open Access Journals (Sweden)

    Roghayeh Saneifar

    2015-11-01

    Full Text Available With the increasing use of data mining techniques to improve the operation of educational systems, Educational Data Mining has been introduced as a new and fast-growing research area. Educational Data Mining aims to analyze data in educational environments in order to solve educational research problems. In this paper a new associative classification technique is proposed to predict students' final performance. Unlike several machine learning approaches such as ANNs and SVMs, associative classifiers maintain interpretability along with high accuracy. In this research work, we have employed Honeybee Colony Optimization and Particle Swarm Optimization to extract association rules for student performance prediction as a multi-objective classification problem. Results indicate that the proposed swarm-based algorithm outperforms well-known classification techniques on the student performance prediction problem.

  16. Significant cancer prevention factor extraction: an association rule discovery approach.

    Science.gov (United States)

    Nahar, Jesmin; Tickle, Kevin S; Ali, A B M Shawkat; Chen, Yi-Ping Phoebe

    2011-06-01

    Cancer is increasing the total number of unexpected deaths around the world. Until now, cancer research could not significantly contribute to a proper solution for the cancer patient, and as a result, the high death rate is uncontrolled. The present research aim is to extract the significant prevention factors for particular types of cancer. To find out the prevention factors, we first constructed a prevention factor data set with an extensive literature review on bladder, breast, cervical, lung, prostate and skin cancer. We subsequently employed three association rule mining algorithms, Apriori, Predictive apriori and Tertius algorithms in order to discover most of the significant prevention factors against these specific types of cancer. Experimental results illustrate that Apriori is the most useful association rule-mining algorithm to be used in the discovery of prevention factors. PMID:20703554

  17. Rough Set Model for Discovering Hybrid Association Rules

    OpenAIRE

    Pandey, Anjana; Pardasani, K. R.

    2009-01-01

    In this paper, the mining of hybrid association rules with a rough set approach is investigated as the algorithm RSHAR. The RSHAR algorithm consists mainly of two steps. First, the participant tables are joined into a general table to generate rules that express the relationship between two or more domains belonging to several different tables in a database. Then we apply the mapping code on selected dimension, which can be added directly into the information system as one cer...

  18. Exploring the associated rules of traditional Chinese medicine and western medicine on urinary tract infection with text mining technique

    Institute of Scientific and Technical Information of China (English)

    陈文; 姜洋; 黄蕙莉; 孙玉香

    2013-01-01

    Objective: To explore the associated rules between western medicine and traditional Chinese medicine (TCM) on urinary tract infection (UTI) with a text mining technique. Methods: The data set on UTI was downloaded from the CBM database. The regularities of Chinese patent medicines (CPM), western medicines, and the combination of CPM and western medicines on UTI were mined by a data slicing algorithm based on frequency statistics of sensitive keywords. The results were shown visually with Cytoscape 2.8 software. Results: The main function of CPM was focused on clearing heat and removing toxicity, promoting diuresis and relieving stranguria. For western medicine, antibacterial agents were often used, and they were also frequently used together with CPM such as Sanjinpian. Conclusions: The text mining approach provides an important method for summarizing the application regularities for a disease in both TCM and western medicine.

  19. Statistical Analysis and Association Rule Mining of Application in College Students’ Mental Health

    Institute of Scientific and Technical Information of China (English)

    亓文娟; 黄书城

    2014-01-01

    To better understand the main factors affecting college students' mental health and the relationships among psychological symptoms, this study takes the psychological test data of a university's 2011 student cohort as its basis and applies two methods, statistical analysis and association rule mining. The data are analyzed in terms of gender, student-cadre status, only-child status, place of origin, family structure, and monthly family income. The results provide a basis for planning and decision making in universities' mental health education for college students.

  20. Mine-associated wetlands as avian habitat

    International Nuclear Information System (INIS)

    Surveys for interior wetland birds at mine-associated emergent wetlands on coal surface mines in southern Illinois detected one state threatened and two state endangered species. Breeding by least bittern (Ixobrychus exilis) and common moorhen (Gallinula chloropus) was confirmed. Regional assessment of potential wetland bird habitat south of Illinois Interstate 64 identified a total of 8,109 ha of emergent stable water wetlands; 10% were associated with mining. Mine-associated wetlands with persistent hydrology and large expanses of emergent vegetation provide habitat that could potentially compensate for loss of natural wetlands in Illinois

  1. Implementation of the Associative Classification Algorithm and Format of Dataset in Context of Data Mining

    Directory of Open Access Journals (Sweden)

    Gajraj Singh

    2013-06-01

    Full Text Available This work addresses the construction of classification models based on association rules. Although association rules have been predominantly used for data exploration and description, the interest in using them for prediction has rapidly increased in the data mining community. In order to mine only rules that can be used for classification, I modified the well-known association rule mining algorithm Apriori to handle user-defined input constraints. We considered constraints that require the presence or absence of particular items, or that limit the number of items in the antecedents and/or the consequents of the rules. We developed a characterization of those itemsets that will potentially form rules satisfying the given constraints. This characterization allows us to prune during itemset construction, which improves its time performance. Using this characterization, we implemented a classification system based on association rules. Furthermore, I enhanced the algorithm by relying on the typical support/confidence framework and mining for the best possible rules above a user-defined minimum confidence and within a desired range for the number of rules [9]. This avoids long mining times that might produce large collections of rules with low predictive power.

  2. Rough Set Model for Discovering Hybrid Association Rules

    CERN Document Server

    Pandey, Anjana

    2009-01-01

    In this paper, the mining of hybrid association rules with a rough set approach is investigated as the algorithm RSHAR. The RSHAR algorithm consists mainly of two steps. First, the participant tables are joined into a general table to generate rules that express the relationship between two or more domains belonging to several different tables in a database. Then the mapping code is applied to the selected dimension, which can be added directly into the information system as one certain attribute. To find the association rules, frequent itemsets are generated in the second step, where candidate itemsets are generated through equivalence classes and the mapping code is transformed back into real dimensions. The search method for candidate itemsets is similar to the Apriori algorithm. The performance of the algorithm has been analyzed.
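    Since the abstract says the candidate search resembles Apriori, a minimal sketch of the classic level-wise Apriori search may help as a reference point. It is not the RSHAR algorithm: the table join, the mapping code, and the equivalence classes are all omitted, and the function name and toy data are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support is
    at least `min_support`, using a level-wise (Apriori-style) search."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Count each candidate and keep those meeting the threshold.
        kept = {}
        for c in candidates:
            count = sum(c <= t for t in transactions)
            if count / n >= min_support:
                kept[c] = count / n
        return kept

    # Level 1: single items.
    level = frequent({frozenset([i]) for t in transactions for i in t})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


toy_db = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
frequent_itemsets = apriori(toy_db, min_support=0.5)
```

    The prune step relies on the downward-closure property: an itemset can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before any counting.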

  3. Bayesian Belief Network untuk Menghasilkan Fuzzy Association Rules

    Directory of Open Access Journals (Sweden)

    Rolly Intan

    2010-01-01

    Full Text Available Bayesian Belief Network (BBN), one of the data mining classification methods, is used in this research for mining and analyzing medical track records from a relational data table. In this paper, a mutual information concept is extended using fuzzy labels for determining the relation between two fuzzy nodes. The highest fuzzy information gain is used for mining fuzzy association rules in order to extend a BBN. Meaningful fuzzy labels can be defined for each data domain. For example, fuzzy labels of secondary disease and complication disease are defined for a disease classification. The implementation of the extended BBN in an application program contributes to analyzing medical track records based on the BBN graph and conditional probability tables.

  4. Software Defect Association Mining and Defect Correction Effort Prediction

    OpenAIRE

    Song, Q; Shepperd, MJ; Cartwright, MH; Mair, C.

    2006-01-01

    Much current software defect prediction work concentrates on the number of defects remaining in software system. In this paper, we present association rule mining based methods to predict defect associations and defect-correction effort. This is to help developers detect software defects and assist project managers in allocating testing resources more effectively. We applied the proposed methods to the SEL defect data consisting of more than 200 projects over more than 15 years. The results s...

  5. ENSEMBLE APPROACH FOR RULE EXTRACTION IN DATA MINING.

    Directory of Open Access Journals (Sweden)

    HITESH NINAMA

    2013-06-01

    Full Text Available A major drawback with neural networks is that the models produced are opaque; i.e. they do not permit human inspection or understanding. A decision tree model, on the other hand, is regarded as comprehensible since it is transparent, making it possible for a human to follow and understand the logic behind a prediction. Although accuracy is the prioritized criterion for predictive modeling, the comprehensibility of the model is often very important. A comprehensible model makes it possible for the user to understand not only the model itself but also why individual predictions are made. Traditionally, most research papers focus on high accuracy, although the comprehensibility criterion is often emphasized by business representatives. Clearly, comprehensibility is very important for data mining techniques. Since techniques producing opaque models normally obtain the highest accuracy, it seems inevitable that the choice of technique is a direct trade-off between accuracy and comprehensibility. With this trade-off in mind, several researchers have tried to bridge the gap by introducing techniques for transforming opaque models into transparent models while keeping an acceptable accuracy. Most significant are the many attempts to extract rules from trained neural networks. This technique of transforming an opaque model into a transparent model is called Rule Extraction (RE). Within the machine learning research community it is, however, also well known that it is possible to obtain even higher accuracy by combining several individual models into ensembles. The overall goal when creating an ensemble is to combine models that are highly accurate but differ in their predictions. The most common ensemble techniques are probably bagging, boosting and stacking, all of which can be applied to different types of models and perform both regression and classification. Most importantly, bagging, boosting and stacking will almost always increase predictive performance.

  6. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  7. Research on Weighted Fuzzy Association Rules

    Institute of Scientific and Technical Information of China (English)

    陆建江

    2003-01-01

    Algorithms for mining quantitative association rules consider each attribute equally, but the attributes usually have different importance. Two kinds of algorithms for mining weighted fuzzy association rules are provided with respect to two kinds of database. The first algorithm can effectively consider the importance of quantitative attributes, and considers that the importance of an association rule is not increased with the number of attributes in the rule. The second algorithm not only considers the importance of quantitative attributes, but also considers that the importance of an association rule is increased with the number of attributes in the rule.

  8. Extracting Cross-Ontology Weighted Association Rules from Gene Ontology Annotations.

    Science.gov (United States)

    Agapito, Giuseppe; Milano, Marianna; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-01-01

    Gene Ontology (GO) is a structured repository of concepts (GO Terms) that are associated to one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. There are different approaches of analysis, among those, the use of association rules (AR) which provides useful knowledge, discovering biologically relevant associations between terms of GO, not previously known. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. We here adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state of the art approaches. PMID:27045823

  9. On construction of partial association rules

    KAUST Repository

    Moshkov, Mikhail

    2009-01-01

    This paper is devoted to the study of approximate algorithms for minimization of partial association rule length. It is shown that under some natural assumptions on the class NP, a greedy algorithm is close to the best polynomial approximate algorithms for solving of this NP-hard problem. The paper contains various bounds on precision of the greedy algorithm, bounds on minimal length of rules based on an information obtained during greedy algorithm work, and results of the study of association rules for the most part of binary information systems. © 2009 Springer Berlin Heidelberg.

  10. EMCAR: Expert Multi Class Based on Association Rule

    Directory of Open Access Journals (Sweden)

    Wa'el Hadi

    2013-04-01

    Full Text Available Several experimental studies have revealed that expert systems have been successfully applied in real-world domains such as medical diagnosis, traffic control, and many others. However, one of the major drawbacks of classic expert systems is their reliance on human domain experts, which requires time, care, experience and accuracy. This shortcoming may also result in knowledge bases that contain inconsistent or contradictory rules. To address this, we propose and develop automated methods based on a data mining approach called Associative Classification (AC) that can be easily integrated into an expert system to produce the knowledge base according to hidden correlations in the input database. The methodology employed in the proposed expert system is based on learning the rules from the database rather than having the knowledge engineer input rules obtained from the domain expert; therefore care and accuracy as well as processing time are improved. The proposed automated expert system contains a novel learning method based on AC mining that has been evaluated on Islamic textual data according to several evaluation measures including recall, precision and classification accuracy. Furthermore, five different classification approaches, namely decision trees (C4.5), KNN, SVM, MCAR and NB, and the proposed automated expert system have been tested on the Islamic data set to determine the suitable method for classifying Arabic texts.

  11. Investigation of Medication Rule in Wang Zhongqi's Medical Records by Frequent Itemset Mining and Association Rule Learning

    Institute of Scientific and Technical Information of China (English)

    张凯; 寿志勤; 郭亚光; 马宗华; 郑日新

    2013-01-01

    Objective: To investigate the medication rules in Wang Zhongqi's Medical Records, conduct association rule analysis, and provide a reference for clinical medication. Methods: Taking the medical records for "hemoptysis", "consumptive disease", and "damp-warm syndrome" as the objects of research, the information structure of the medical records was analyzed in order to preprocess the original text and build a database. By integrating the Apriori association rule algorithm, the "Xin'an Traditional Chinese Medicine Clinical Guide System" was designed and implemented to visualize the data mining results and to provide a "clinical query application" function and association analysis of medication rules. Results: The association analysis showed that the common medicines for treating hemoptysis were loofah sponge, Rubia cordifolia root, Cortex Moutan, and so on, with loofah sponge and Rubia cordifolia root as the core pair; the common medicines for treating consumptive disease were Dendrobium, Concha Ostreae, licorice, and so on, with Dendrobium and Concha Ostreae as the core pair; the common medicines for treating damp-warm syndrome were Poria, Eupatorium, apricot kernel, and so on, with Poria and Eupatorium as the core pair. Conclusion: Association rule analysis can be used to mine the medication rules in medical records, and this technical framework can be applied to the study of other medical texts.

  12. Association-rule based information source selection

    OpenAIRE

    Yang, Hui; Zhang, Minjie; Shi, Zhongzhi

    2004-01-01

    The proliferation of information sources available on the World Wide Web has resulted in a need for database selection tools to locate the potentially useful information sources with respect to the user's information need. Current database selection tools always treat each database independently, ignoring the implicit, useful associations between distributed databases. To overcome this shortcoming, in this paper, we introduce a data-mining approach to assist the process of database selection by...

  13. Reduction of Number of Association Rules with Inter Itemset Distance in Transaction Databases

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar Deva Sarma

    2012-11-01

    Association rule discovery has been an important problem of investigation in knowledge discovery and data mining. An association rule describes associations among the sets of items which occur together in transactions of databases. The association rule mining task consists of finding the frequent itemsets and the rules in the form of conditional implications with respect to some prespecified threshold values of support and confidence. The interestingness of association rules is determined by these two measures. However, other measures of interestingness like lift and conviction are also used. Even so, the number of discovered association rules grows explosively, and many such rules are insignificant. In this paper we introduce a new measure of interestingness called Inter Itemset Distance or Spread, and implement this notion based on the approaches of the Apriori algorithm with a view to reducing the number of discovered association rules in a meaningful manner. The working of the new algorithm is analyzed, and the results are presented and compared with the results of the conventional Apriori algorithm.
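
    The four conventional measures named in this abstract (support, confidence, lift and conviction) have standard textbook definitions, which can be sketched as follows. This is a minimal illustration, assuming transactions are given as sets of items; the function names `support` and `rule_measures` and the toy basket data are hypothetical and do not come from the paper. The Inter Itemset Distance measure is not reproduced because the abstract does not define it.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def rule_measures(transactions, antecedent, consequent):
    """Standard interestingness measures for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    supp_a = support(transactions, a)
    supp_c = support(transactions, c)
    supp_ac = support(transactions, a | c)
    if supp_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    confidence = supp_ac / supp_a
    # lift > 1 suggests positive correlation, lift < 1 negative correlation
    lift = confidence / supp_c if supp_c > 0 else float("inf")
    # conviction is infinite for rules that are never violated
    conviction = (
        float("inf") if confidence == 1 else (1 - supp_c) / (1 - confidence)
    )
    return {
        "support": supp_ac,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


if __name__ == "__main__":
    # Hypothetical toy market-basket data
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    print(rule_measures(baskets, {"bread"}, {"milk"}))
```

    On this toy data the rule bread -> milk has support 0.6 and confidence of about 0.75, yet its lift is below 1, which illustrates why support and confidence alone can admit uninteresting rules.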

  14. Integrating association rules and case-based reasoning to predict retinopathy

    Directory of Open Access Journals (Sweden)

    Vimala Balakrishnan

    This study proposes a retinopathy prediction system based on data mining, particularly association rules using the Apriori algorithm, and case-based reasoning. The association rules are used to analyse patterns in the data set and to calculate retinopathy probability, whereas case-based reasoning is used to retrieve similar cases. This paper discusses the proposed system. It is believed that great improvements can be provided to medical practitioners and also to diabetics with the implementation of this system.

  15. Modified Approach for Hiding Sensitive Association Rules for Preserving Privacy in Database

    Directory of Open Access Journals (Sweden)

    Tania Banerjee

    2014-03-01

    Data mining is the process of analyzing large databases to find useful patterns. The term pattern refers to items that occur frequently in a set of transactions. The frequent patterns are used to find associations between sets of items. The efficiency of mining association rules and the confidentiality of association rules are becoming important areas of knowledge discovery in databases. This paper is organized into two sections. First, an Apriori algorithm is presented that efficiently generates association rules; it reduces unnecessary database scans when forming frequent large itemsets. We then contribute to the improved Apriori algorithm by hiding sensitive association rules that are generated by applying it to a supermarket database. We use a novel approach that strategically modifies a few transactions in the transaction database to decrease the support and confidence of a sensitive rule without producing any side effects. Thus, in the paper we efficiently generate frequent itemsets by applying the improved Apriori algorithm, generate association rules by applying minimum support and minimum confidence, and then go one step further to identify sensitive rules and hide them without side effects, maintaining the integrity of the data without generating spurious rules.
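
    The abstract does not say which transactions are modified or how they are chosen. One generic way to lower the support and confidence of a sensitive rule is to delete a consequent item from transactions that support the rule until the rule falls below a threshold. The sketch below illustrates only that generic idea; the function name `hide_rule`, the choice of victim item and the toy data are assumptions, and the sketch makes no attempt to control side effects on other rules, which the paper claims to handle.

```python
def hide_rule(transactions, antecedent, consequent, min_support, min_confidence):
    """Return (sanitized_db, n_modified) so that antecedent -> consequent
    no longer meets both thresholds. Assumes antecedent and consequent
    are disjoint, non-empty item sets."""
    a, c = frozenset(antecedent), frozenset(consequent)
    db = [set(t) for t in transactions]      # work on a copy
    n = len(db)
    victim = sorted(c)[0]                    # assumed choice: first consequent item
    modified = 0
    for t in db:
        supp_a = sum(a <= x for x in db)
        supp_ac = sum((a | c) <= x for x in db)
        hidden = (
            supp_a == 0
            or supp_ac / n < min_support
            or supp_ac / supp_a < min_confidence
        )
        if hidden:
            break
        if (a | c) <= t:
            # Removing a consequent item lowers the support of the rule
            # while leaving the support of the antecedent unchanged.
            t.discard(victim)
            modified += 1
    return db, modified


if __name__ == "__main__":
    # Hypothetical supermarket data and sensitive rule bread -> milk
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    sanitized, changed = hide_rule(baskets, {"bread"}, {"milk"}, 0.4, 0.7)
    print(changed, sanitized)
```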

  16. MAGDM-Miner: A New Algorithm for Mining Trapezoidal Intuitionistic Fuzzy Correlation Rules

    OpenAIRE

    Robinson, John P.; Henry Amirtharaj

    2014-01-01

    In this article, the authors propose a new framework called the MAGDM-Miner, for mining correlation rules from trapezoidal intuitionistic fuzzy data efficiently. In the MAGDM-Miner, the raw data from a Multiple Attribute Group Decision Making (MAGDM) problem with trapezoidal intuitionistic fuzzy data are first pre-processed using some arithmetic aggregation operators. The aggregated data in turn are processed for efficient data selection through fuzzy correlation rule mining where the unwante...

  17. 5 CFR 5201.105 - Additional rules for Mine Safety and Health Administration employees.

    Science.gov (United States)

    2010-01-01

    ... Health Administration employees. 5201.105 Section 5201.105 Administrative Personnel DEPARTMENT OF LABOR... for Mine Safety and Health Administration employees. The rules in this section apply to employees of the Mine Safety and Health Administration (MSHA) and are in addition to §§ 5201.101, 5201.102,...

  18. Re-mining association mining results through visualization, data envelopment analysis, and decision trees

    OpenAIRE

    Ertek, Gürdal; Tunç, Murat Mustafa

    2012-01-01

    Re-mining is a general framework which suggests the execution of additional data mining steps based on the results of an original data mining process. This study investigates the multi-faceted re-mining of association mining results, develops and presents a practical methodology, and shows the applicability of the developed methodology through real world data. The methodology suggests re-mining using data visualization, data envelopment analysis, and decision trees. Six hypotheses, regarding ...

  19. Inter-transactional association rules for multi-dimensional contexts for prediction and their application to studying meteorological data

    NARCIS (Netherlands)

    Feng, Ling; Dillon, Tharam; Liu, James; Chen, P.P.

    2001-01-01

    Inter-transactional association rules, first presented in our early work [H. Lu, J. Han, L. Feng, Stock movement prediction and n-dimensional inter-transaction association rules, in: Proceedings of the ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Seattle, Washington

  20. Analysis of Distributed and Adaptive Genetic Algorithm for Mining Interesting Classification Rules

    Institute of Scientific and Technical Information of China (English)

    YI Yunfei; LIN Fang; QIN Jun

    2008-01-01

    A distributed genetic algorithm can be combined with an adaptive genetic algorithm for mining interesting and comprehensible classification rules. The paper gives the method for encoding the rules and the fitness function, and designs the selection, crossover, mutation and migration operators for the DAGA.

  1. Recommendation System Based On Association Rules For Distributed E-Learning Management Systems

    Science.gov (United States)

    Mihai, Gabroveanu

    2015-09-01

    Traditional Learning Management Systems are installed on a single server where learning materials and user data are kept. To increase its performance, the Learning Management System can be installed on multiple servers; learning materials and user data can be distributed across these servers, obtaining a Distributed Learning Management System. This paper proposes the prototype of a recommendation system based on association rules for a Distributed Learning Management System. Information from LMS databases is analyzed using distributed data mining algorithms in order to extract the association rules. Then the extracted rules are used as inference rules to provide personalized recommendations. The quality of the provided recommendations is improved because the rules used to make the inferences are more accurate, since these rules aggregate knowledge from all e-Learning systems included in the Distributed Learning Management System.

  2. Analysis on Biological Information Network Multidimensional Data Mining Algorithm Based on Association Rule Mapping

    Institute of Scientific and Technical Information of China (English)

    吐尔逊江·托合提

    2015-01-01

    The algorithms used for large-scale mining of biological information network data have many deficiencies, such as low precision, slow running speed and high memory use. Against this background, this paper proposes a data mining algorithm that can map and associate biological information. The algorithm not only maps associations and determines the network data set, but also introduces relative error to improve the accuracy of the algorithm. By constructing associations between data sets, the data within the space can be distinguished, achieving a better data mining effect.

  3. How to Mine Information from Each Instance to Extract an Abbreviated and Credible Logical Rule

    Directory of Open Access Journals (Sweden)

    Limin Wang

    2014-10-01

    Decision trees are particularly promising in symbolic representation and reasoning due to their comprehensible nature, which resembles the hierarchical process of human decision making. However, their drawbacks, caused by the single-tree structure, cannot be ignored. A rigid decision path may cause the majority class to overwhelm other classes when dealing with imbalanced data sets, and pruning removes not only superfluous nodes, but also subtrees. The proposed learning algorithm, flexible hybrid decision forest (FHDF), mines information implicated in each instance to form logical rules on the basis of a chain rule of local mutual information, then forms different decision tree structures and decision forests later. The most credible decision path from the decision forest can be selected to make a prediction. Furthermore, functional dependencies (FDs), which are extracted from the whole data set based on association rule analysis, perform embedded attribute selection to remove nodes rather than subtrees, thus helping to achieve different levels of knowledge representation and improve model comprehension in the framework of semi-supervised learning. Naive Bayes replaces the leaf nodes at the bottom of the tree hierarchy, where the conditional independence assumption may hold. This technique reduces the potential for overfitting and overtraining and improves the prediction quality and generalization. Experimental results on UCI data sets demonstrate the efficacy of the proposed approach.

  4. Mining for associations between text and brain activation in a functional neuroimaging database

    DEFF Research Database (Denmark)

    Nielsen, Finn Arup; Hansen, Lars Kai; Balslev, Daniela

    2004-01-01

    We describe a method for mining a neuroimaging database for associations between text and brain locations. The objective is to discover association rules between words indicative of cognitive function as described in abstracts of neuroscience papers and sets of reported stereotactic Talairach...

  5. Performance analysis of modified algorithm for finding multilevel association rules

    OpenAIRE

    Shrivastava, Arpna; Jain, R. C.

    2013-01-01

    Multilevel association rules explore the concept hierarchy at multiple levels, which provides more specific information. The Apriori algorithm explores single-level association rules, and many implementations of it are available. A fast Apriori implementation is modified to develop a new algorithm for finding multilevel association rules. In this study the performance of this new algorithm is analyzed in terms of running time in seconds.

  6. Mining Target-Oriented Fuzzy Correlation Rules to Optimize Telecom Service Management

    CERN Document Server

    Chueh, Hao-En

    2011-01-01

    To optimize telecom service management, it is necessary that information about telecom services is highly related to the most popular telecom service. To this end, we propose an algorithm for mining target-oriented fuzzy correlation rules. In this paper, we show that by using the fuzzy statistics analysis and the data mining technology, the target-oriented fuzzy correlation rules can be obtained from a given database. We conduct an experiment by using a sample database from a telecom service provider in Taiwan. Our work can be used to assist the telecom service provider in providing the appropriate services to the customers for better customer relationship management.

  7. Overlying strata movement rules and safety mining technology for the shallow depth seam proximity beneath a room mining goaf

    Institute of Scientific and Technical Information of China (English)

    Wang Fangtian; Zhang Cun; Zhang Xiaogang; Song Qi

    2015-01-01

    For a shallow seam in close proximity beneath a room mining goaf, where the seam is extracted by longwall mining and overlain by thin bedrock and thick loose sands, many accidents are likely to occur, including roof structure instability, roof step subsidence, damage to shield supports, and face bumps triggered by large-area roof weighting, resulting in serious threats to the safety of underground miners and equipment. This paper analyses the overlying strata movement rules for shallow seams using physical simulation, 3DEC numerical simulation and field measurements. The results show that, in shallow seam mining, the overburden movement forms a caved zone and a fractured zone, the cracks develop continuously and reach the surface as the face advances, and the development of surface cracks generally goes through four stages. With the application of loose blasting of residual pillars, a reasonable mining height, and roof support and management, safe, efficient and high-recovery-rate mining has been achieved in the shallow seam in close proximity beneath a room mining goaf.

  8. Greedy algorithms with weights for construction of partial association rules

    KAUST Repository

    Moshkov, Mikhail

    2009-09-10

    This paper is devoted to the study of approximate algorithms for minimization of the total weight of attributes occurring in partial association rules. We consider mainly greedy algorithms with weights for construction of rules. The paper contains bounds on the precision of these algorithms and bounds on the minimal weight of partial association rules based on information obtained during the greedy algorithm run.

  9. [Acupoints selection rules analysis of ancient acupuncture for urinary incontinence based on data mining technology].

    Science.gov (United States)

    Zhang, Wei; Tan, Zhigao; Cao, Juanshu; Gong, Houwu; Qin, Zuoai; Zhong, Feng; Cao, Yue; Wei, Yanrong

    2015-12-01

    Based on ancient literature of acupuncture in Canon of Chinese Medicine (4th edition), the articles regarding acupuncture for urinary incontinence were retrieved and collected to establish a database. Using the Weka data mining software, the multi-level association rules analysis method was applied to analyze the acupoint selection characteristics and rules of ancient acupuncture for treatment of urinary incontinence. In total, 356 articles on acupuncture for urinary incontinence were collected, involving 41 acupoints with a total frequency of 364. As a result, (1) the acupoints in the yin-meridians of hand and foot were highly valued, as the frequency of acupoints in yin-meridians was 2.6 times that in yang-meridians, and the acupoints of the liver meridian of foot-jueyin were selected most frequently; (2) the acupoints in the bladder meridian of foot-taiyang were also highly valued, and among the three yang-meridians of foot, the frequency of acupoints in the bladder meridian of foot-taiyang was 54, accounting for 65.85% (54/82); (3) more of the selected acupoints were located in the lower limbs and abdomen; (4) specific acupoints in the above meridians were mostly selected, accounting for 73.2% (30/41) of the number of acupoints and 79.4% (289/364) of the frequency, respectively; (5) Zhongji (CV 3), the front-mu point of the bladder meridian, was seldom selected in the ancient acupuncture literature, which differs from modern literature reports. The results show that urinary incontinence belongs to external genitalia diseases, which should be treated from yin, indicating that more yin-meridians should be used and special acupoints should be focused on. It is essential to focus on inheritance and innovation in TCM clinical treatment, and applying data mining technology to ancient literature of acupuncture could provide a classic theory basis for TCM clinical treatment. PMID:26964186

  10. Formal and Computational Properties of the Confidence Boost of Association Rules

    CERN Document Server

    Balcázar, José L

    2011-01-01

    Some existing notions of redundancy among association rules allow for a logical-style characterization and lead to irredundant bases of absolutely minimum size. One can push the intuition of redundancy further and find an intuitive notion of interest of an association rule, in terms of its "novelty" with respect to other rules. Namely: an irredundant rule is so because its confidence is higher than what the rest of the rules would suggest; then, one can ask: how much higher? We propose to measure such a sort of "novelty" through the confidence boost of a rule, which encompasses two previous similar notions (confidence width and rule blocking, of which the latter is closely related to the earlier measure "improvement"). Acting as a complement to confidence and support, the confidence boost helps to obtain small and crisp sets of mined association rules, and solves the well-known problem that, in certain cases, rules of negative correlation may pass the confidence bound. We analyze the properties of two version...

  11. Impact of Data Mining in Drought Monitoring

    OpenAIRE

    Anil Rajput; Ritu Soni; Ramesh Prasad Aharwal; Rajesh Sharma

    2011-01-01

    This paper focuses on association rule mining and decision tree classification in rainfall and temperature data. We used the Apriori association rule mining algorithm with the data mining tool WEKA, and tried to generate association rules and a decision tree model.

  12. FSRM: A Fast Algorithm for Sequential Rule Mining

    Directory of Open Access Journals (Sweden)

    Anjali Paliwal

    2014-10-01

    Recent developments in computing and automation technologies have resulted in computerizing business and scientific applications in various areas. Turning the massive amounts of accumulated information into knowledge is attracting researchers in numerous domains such as databases, machine learning, statistics, and so on. From the viewpoint of information researchers, the stress is on discovering meaningful patterns hidden in the massive data sets. Hence, a central issue for knowledge discovery in databases, and the main focus of this paper, is to develop economical and scalable mining algorithms as integrated tools for management systems.

  13. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    OpenAIRE

    Ujjwal Maulik; Saurav Mallik; Anirban Mukhopadhyay; Sanghamitra Bandyopadhyay

    2015-01-01

    Microarray and beadchip are two most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining) to identify special type of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the bio...

  14. Finding Exception For Association Rules Via SQL Queries

    Directory of Open Access Journals (Sweden)

    Luminita DUMITRIU

    2000-12-01

    Finding association rules is mainly based on generating larger and larger frequent set candidates, starting from frequent attributes in the database. The frequent sets can be organised as a part of a lattice of concepts according to the Formal Concept Analysis approach. Since the lattice construction is database contents-dependent, the pseudo-intents (see Formal Concept Analysis) are avoided. Association rules between concept intents (closed sets) A=>B are partial implication rules, meaning that there is some data supporting A and (not B); fully explaining the data requires finding exceptions for the association rules. The approach applies to Oracle databases, via SQL queries.

  15. Mining tree-query associations in graphs

    CERN Document Server

    Hoekx, Eveline

    2010-01-01

    New applications of data mining, such as in biology, bioinformatics, or sociology, are faced with large datasets structured as graphs. We introduce a novel class of tree-shaped patterns called tree queries, and present algorithms for mining tree queries and tree-query associations in a large data graph. Novel about our class of patterns is that they can contain constants, and can contain existential nodes which are not counted when determining the number of occurrences of the pattern in the data graph. Our algorithms have a number of provable optimality properties, which are based on the theory of conjunctive database queries. We propose a practical, database-oriented implementation in SQL, and show that the approach works in practice through experiments on data about food webs, protein interactions, and citation analysis.

  16. Generalization-based discovery of spatial association rules with linguistic cloud models

    Institute of Scientific and Technical Information of China (English)

    杨斌; 田永青; 朱仲英

    2004-01-01

    Extraction of interesting and general spatial association rules from large spatial databases is an important task in the development of spatial database systems. In this paper, we investigate the generalization-based knowledge discovery mechanism that integrates attribute-oriented induction on nonspatial data and spatial merging and generalization on spatial data. Furthermore, we present linguistic cloud models for knowledge representation and uncertainty handling to enhance current generalization-based method. With these models, spatial and nonspatial attribute values are well generalized at higher-concept levels, allowing discovery of strong spatial association rules. Combining the cloud model based generalization method with Apriori algorithm for mining association rules from a spatial database shows the benefits in effectiveness and flexibility.

  17. Mining Compatibility Rules from Irregular Chinese Traditional Medicine Database by Apriori Algorithm

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    This paper aims to mine the knowledge and rules on compatibility of drugs from the prescriptions for curing arrhythmia in the Chinese traditional medicine database by the Apriori algorithm. For data preparation, 1,113 prescriptions for arrhythmia, including 535 herbs (10,884 herb occurrences in total), were collected into the database. The prescription data were preprocessed through redundancy reduction, normalized storage, and knowledge induction according to the pretreatment demands of data mining. Then the Apriori algorithm was used to analyze the data and form the related technical rules and treatment procedures. The experimental result shows that the prescription compatibility obtained by the Apriori algorithm generally accords with the basic law of traditional Chinese medicine for arrhythmia. Some previously unreported compatibilities were also discovered in the experiment, which may be used as the basis for developing new prescriptions for arrhythmia.

  18. RULE-BASE DATA MINING SYSTEMS FOR CUSTOMER QUERIES

    OpenAIRE

    A.Kaleeswaran, V.Ramasamy

    2012-01-01

    The main objective of this paper is to achieve the best association between customer and organization. This project is proposed to discover knowledge from huge amounts of data and to use the data efficiently, given the great demand. Banking is the most commonly used application in the financial sector, in which the Enterprise Resource Planning (ERP) model is most widely used for cost control, accounting, e-business and analysis. Customer orders are routed automatically to the next de...

  19. Statistical, Logic-Based, and Neural Networks Based Methods for Mining Rules from Data

    Czech Academy of Sciences Publication Activity Database

    Holeňa, Martin

    Dordrecht: Kluwer Academic Publishers, 2002 - (Hyder, A.; Shahbazian, E.; Waltz, E.), s. 511-532. (NATO Science Series). ISBN 1-4020-0722-1. [NATO Advanced study Institute on MSDF. Pitlochry (GB), 25.06.2000-07.07.2000] R&D Projects: GA AV ČR IAB2030007 Institutional research plan: AV0Z1030915 Keywords : data mining * integrative framework * observational logic * statistical hypotheses testing * rule extraction with artificial neural networks Subject RIV: BA - General Mathematics

  20. Analysis of obstruction reason of urban sewer using spatial association rules

    Science.gov (United States)

    Zhu, Hongmei; Luo, Yu

    2009-10-01

    A sewerage network is an important part of the municipal infrastructure of a city. Obstruction of sewers causes street flooding and affects people's daily life directly. To investigate why some sewage pipes are blocked frequently in Kunming, China, we employ spatial analysis and data mining technology to analyze the data on the basis of a municipal sewerage geographic information system of the city. In the GIS, all map layers and attribute tables are organized and saved in a relational database with the Geodatabase model. First, we combined SQL attribute query with spatial location query to find the sewage pipes that are blocked frequently. Then, we carried out buffer analysis and intersect analysis on the layers of the frequently-blocked pipes and buildings along the streets to extract buildings that are close to these frequently-blocked pipes. Joining the buildings in the buffer scope and the frequently-blocked pipes forms a big table prepared for spatial data mining. We used the Apriori algorithm to mine spatial association rules from the data in the big table in order to search for implicit reasons for obstruction of the pipes. The results from data mining indicate that strong spatial and non-spatial association rules exist between the obstruction and restaurants in the buildings, as well as pipe attributes such as slope and diameter.

  1. AN EVALUATION APPROACH FOR THE PROGRAM OF ASSOCIATION RULES ALGORITHM BASED ON METAMORPHIC RELATIONS

    Institute of Scientific and Technical Information of China (English)

    Zhang Jing; Hu Xuegang; Zhang Bin

    2011-01-01

    As data mining is applied more and more widely in computer systems, quality assurance testing of its software is receiving more and more attention. However, because of the "oracle" problem, traditional test methods are not easily suited to application programs in the data mining field. In this paper, based on metamorphic testing, a software testing method for the data mining field is proposed; an association rules algorithm is taken as the specific case, and metamorphic relations are constructed on the algorithm. Experiments show that the method can achieve the testing target and is feasible to apply to other domains.
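
    A metamorphic relation checks a program by comparing its outputs on related inputs, so no oracle for the correct output is needed. The abstract does not list the relations the authors constructed; the sketch below shows two relations that plausibly hold for any frequent itemset miner that uses relative support (reordering the transactions, and duplicating the whole database). The brute-force miner `frequent_itemsets` is a hypothetical stand-in for the program under test, and all names are illustrative assumptions.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Stand-in program under test: brute-force frequent itemset miner
    returning {itemset: relative support}."""
    db = [frozenset(t) for t in transactions]
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            supp = sum(s <= t for t in db) / len(db)
            if supp >= min_support:
                result[s] = supp
    return result


def mr_permutation(miner, transactions, min_support, seed=0):
    """MR1: shuffling the transaction order must not change the output."""
    shuffled = list(transactions)
    random.Random(seed).shuffle(shuffled)
    return miner(transactions, min_support) == miner(shuffled, min_support)


def mr_duplication(miner, transactions, min_support):
    """MR2: appending a full copy of the database keeps every relative
    support, and therefore the output, unchanged."""
    doubled = list(transactions) * 2
    return miner(transactions, min_support) == miner(doubled, min_support)


if __name__ == "__main__":
    # Hypothetical toy data
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    print(mr_permutation(frequent_itemsets, baskets, 0.4))
    print(mr_duplication(frequent_itemsets, baskets, 0.4))
```

    A miner that violates either relation is faulty, although satisfying both does not prove it correct.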

  2. A GENERAL SURVEY ON FREQUENT PATTERN MINING USING GENETIC ALGORITHM

    OpenAIRE

    K. Poornamala; R. Lawrance

    2012-01-01

    In recent years, data mining has become an important aspect of generating association rules among large numbers of itemsets. Association rule mining is one of the techniques in data mining and has two sub-processes. The first is finding frequent itemsets, and the second is association rule mining, in which rules are extracted from the frequent itemsets. Researchers developed a lot of algorithms for finding frequent itemsets and association ru...
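
    The two sub-processes described in this abstract can be sketched with the classical level-wise Apriori scheme, which appears throughout these records. This is a minimal illustration rather than the genetic-algorithm approach the survey covers; the function names `apriori` and `generate_rules` and the toy data are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Sub-process 1: return {frequent itemset: relative support}."""
    db = [frozenset(t) for t in transactions]
    n = len(db)

    def supp(itemset):
        return sum(itemset <= t for t in db) / n

    # Level 1: frequent single items
    singles = {frozenset([i]) for t in db for i in t}
    level = {s: supp(s) for s in singles}
    level = {s: v for s, v in level.items() if v >= min_support}
    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: unite frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        counted = {c: supp(c) for c in candidates}
        level = {c: v for c, v in counted.items() if v >= min_support}
    return frequent


def generate_rules(frequent, min_confidence):
    """Sub-process 2: derive rules (antecedent, consequent, confidence)."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                a = frozenset(ante)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[a] is always present.
                confidence = s / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical toy data
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
        {"bread", "milk"},
    ]
    found = apriori(baskets, min_support=0.5)
    for rule in generate_rules(found, min_confidence=0.7):
        print(rule)
```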

  3. Investigation of work zone crash casualty patterns using association rules.

    Science.gov (United States)

    Weng, Jinxian; Zhu, Jia-Zheng; Yan, Xuedong; Liu, Zhiyuan

    2016-07-01

    Investigation of the casualty crash characteristics and contributory factors is one of the high-priority issues in traffic safety analysis. In this paper, we propose a method based on association rules to analyze the characteristics and contributory factors of work zone crash casualties. A case study is conducted using the Michigan M-94/I-94/I-94BL/I-94BR work zone crash data from 2004 to 2008. The obtained association rules are divided into two parts, rules with high lift and rules with high support, for further analysis. The results show that almost all the high-lift rules contain either environmental or occupant characteristics. The majority of association rules are centered on specific characteristics, such as drunk driving, highways with more than 4 lanes, speed limits over 40 mph and no use of traffic control devices. It should be pointed out that some more strongly associated rules were found in the high-support part. With network visualization, the association rule method can provide more understandable results for investigating the patterns of work zone crash casualties. PMID:27038500

  4. An Object Extraction Model Using Association Rules and Dependence Analysis

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Extracting objects from legacy systems is a basic step in a system's object-orientation to improve the maintainability and understandability of the systems. A new object extraction model using association rules and dependence analysis is proposed. In this model data are classified by association rules and the corresponding operations are partitioned by dependence analysis.

  5. A Survey on Mining Algorithms

    OpenAIRE

    Patel Nimisha; Prof. Sheetal Mehta

    2013-01-01

    Data mining is a process that discovers knowledge or hidden patterns in large databases. In a large database, association rules are used to find meaningful relationships among large numbers of itemsets, and these itemsets are used to create frequent itemsets. Association rule mining is the most important application in large databases. Most association rule mining algorithms are improved or derivative versions. The traditional algorithms scan databases many times so time complexity and space ...

  6. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    Science.gov (United States)

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the biological datasets. At first, a novel statistical strategy has been utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time, and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to know how accurately the evolved rules are able to describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy, and other related factors, with other rule-based classifiers. Statistical significance tests are also performed for verifying the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers is also starting with the same post-discretized data

  7. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    Directory of Open Access Journals (Sweden)

    Ujjwal Maulik

    Full Text Available Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets using an integrated approach of statistical and binary inclusion-maximal biclustering techniques. At first, a novel statistical strategy is used to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal distribution or non-normal distribution). The data are then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post

  8. MOCANAR: A MULTI-OBJECTIVE CUCKOO SEARCH ALGORITHM FOR NUMERIC ASSOCIATION RULE DISCOVERY

    Directory of Open Access Journals (Sweden)

    Irene Kahvazadeh

    2015-11-01

    Full Text Available Extracting association rules from numeric features involves searching a very large search space. To deal with this problem, this paper uses a meta-heuristic algorithm that we call MOCANAR. MOCANAR is a Pareto-based multi-objective cuckoo search algorithm that extracts high-quality association rules from numeric datasets. Support, confidence, interestingness and comprehensibility are the objectives considered in MOCANAR. MOCANAR extracts rules incrementally: each run of the algorithm produces a small number of high-quality rules. The paper also presents a comprehensive taxonomy of metaheuristic algorithms. Using this taxonomy, we chose a cuckoo search algorithm because it is one of the most mature algorithms and is simple to use and easy to comprehend. In addition, to our knowledge, this method has not previously been used as a multi-objective algorithm or in the association rule mining area. To demonstrate the merit and benefits of the proposed methodology, it was applied to a number of datasets, and high-quality results in terms of the objectives were obtained.
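
    As a rough illustration of the "Pareto based" selection mentioned in this abstract, the sketch below keeps only the rules whose objective vectors are not dominated by any other rule. It is not the MOCANAR algorithm (no cuckoo search is performed); the function names and the toy (support, confidence) pairs are assumptions made for this example, and every objective is assumed to be maximized.

```python
def dominates(p, q):
    """True if p is at least as good as q on every objective and strictly
    better on at least one (all objectives assumed to be maximized)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (support, confidence) pairs for three candidate rules.
candidates = [(0.5, 0.9), (0.6, 0.8), (0.4, 0.7)]
front = pareto_front(candidates)
print(front)
```

    With more objectives (for example interestingness and comprehensibility), the same functions apply unchanged because they compare tuples element by element.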

  9. Fuzzy association rules for biological data analysis: A case study on yeast

    Directory of Open Access Journals (Sweden)

    Cano Carlos

    2008-02-01

    Full Text Available Background: The mapping of diverse genomes in recent years has generated huge amounts of biological data, which are currently dispersed across many databases. Integration of the information available in the various databases is required to unveil possible associations relating already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modeling imprecise data, while association rules are very appropriate for integrating heterogeneous data. Results: In this work we propose a novel fuzzy methodology based on a fuzzy association rule mining method for biological knowledge extraction. We apply this methodology to a yeast genome dataset containing heterogeneous information regarding structural and functional genome features. A number of association rules have been found, many of them agreeing with previous research in the area. In addition, a comparison between crisp and fuzzy results shows the fuzzy associations to be more reliable than the crisp ones. Conclusion: An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases. It is shown that fuzzy association rules can model this knowledge in an intuitive way by using linguistic labels and a few easily understandable parameters.

  10. Rule mining and classification in a situation assessment application: a belief-theoretic approach for handling data imperfections.

    Science.gov (United States)

    Rohitha, K K; Hewawasam, G K; Premaratne, Kamal; Shyu, Mei-Ling

    2007-12-01

    Management of data imprecision and uncertainty has become increasingly important, especially in situation awareness and assessment applications where reliability of the decision-making process is critical (e.g., in military battlefields). These applications require the following: 1) an effective methodology for modeling data imperfections and 2) procedures for enabling knowledge discovery and quantifying and propagating partial or incomplete knowledge throughout the decision-making process. In this paper, using a Dempster-Shafer belief-theoretic relational database (DS-DB) that can conveniently represent a wider class of data imperfections, an association rule mining (ARM)-based classification algorithm possessing the desirable functionality is proposed. For this purpose, various ARM-related notions are revisited so that they could be applied in the presence of data imperfections. A data structure called belief itemset tree is used to efficiently extract frequent itemsets and generate association rules from the proposed DS-DB. This set of rules is used as the basis on which an unknown data record, whose attributes are represented via belief functions, is classified. These algorithms are validated on a simplified situation assessment scenario where sensor observations may have caused data imperfections in both attribute values and class labels. PMID:18179065

  11. Mining and sustainable development: environmental policies and programmes of mining industry associations

    International Nuclear Information System (INIS)

    Mining industry policies and practices have evolved rapidly in the environmental area, and more recently in the social area as well. Mining industry associations are using a variety of methods to stimulate and assist their member companies as they improve their environmental, social and economic performance. These associations provide opportunities for companies to use collaborative approaches in developing and applying improved technology, systems and practices (author)

  12. Multi-Scaling Sampling: An Adaptive Sampling Method for Discovering Approximate Association Rules

    Institute of Scientific and Technical Information of China (English)

    Cai-Yan Jia; Xie-Ping Gao

    2005-01-01

    One obstacle to efficient association rule mining is the explosive growth of data sets, since it is costly or impossible to scan large databases, especially multiple times. A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. But how to effectively define and efficiently estimate the degree of error in the outcome of the algorithm, and how to determine the sample size needed, have remained open research questions. In this paper, an effective and efficient algorithm based on PAC (Probably Approximately Correct) learning theory is given to measure and estimate sample error. Then, a new adaptive, on-line, fast sampling strategy - multi-scaling sampling - inspired by MRA (Multi-Resolution Analysis) and the Shannon sampling theorem is presented for quickly obtaining acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study show that the sampling strategy can achieve a very good speed-accuracy trade-off.
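
    To give a feel for the sample-size question raised in this abstract, the sketch below uses the standard Hoeffding inequality, which is a textbook bound and not necessarily the estimate derived in the paper. It returns a sample size n such that the sampled support of one fixed itemset deviates from its true support by more than epsilon with probability at most delta. The function name and parameters are assumptions made for this example.

```python
import math


def hoeffding_sample_size(epsilon, delta):
    """Smallest n with 2 * exp(-2 * n * epsilon**2) <= delta, i.e. enough
    random transactions to estimate the support of ONE fixed itemset
    within +/- epsilon with probability at least 1 - delta."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))


# Error tolerance of one percentage point, 95% confidence.
print(hoeffding_sample_size(0.01, 0.05))
```

    The bound does not depend on the database size, which is why sampling can pay off on very large databases; covering many itemsets at once would need a union bound or a similar correction, which this sketch omits.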

  13. Association Rule Extraction from XML Stream Data for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Juryon Paik

    2014-07-01

    Full Text Available With the advances of wireless sensor networks, they yield massive volumes of disparate, dynamic and geographically-distributed and heterogeneous data. The data mining community has attempted to extract knowledge from the huge amount of data that they generate. However, previous mining work in WSNs has focused on supporting simple relational data structures, like one table per network, while there is a need for more complex data structures. This deficiency motivates XML, which is the current de facto format for the data exchange and modeling of a wide variety of data sources over the web, to be used in WSNs in order to encourage the interchangeability of heterogeneous types of sensors and systems. However, mining XML data for WSNs has two challenging issues: one is the endless data flow; and the other is the complex tree structure. In this paper, we present several new definitions and techniques related to association rule mining over XML data streams in WSNs. To the best of our knowledge, this work provides the first approach to mining XML stream data that generates frequent tree items without any redundancy.

  14. The spatiotemporal variations rules of Songzao coal mining subsidence based on numerical simulation

    Science.gov (United States)

    Lu, J.; Li, Y.; Cheng, H.; Tang, Z.

    2015-11-01

    .0 m in 1999 was more than twice the area affected by subsidence in 2004. This in return, it was more than 7 times larger than the area affected by subsidence in 2009 of the one affected by subsidence in 2004. The extent of the area affected by the 2.5 m subsidence has also enlarged rapidly: in 2009 this area was about 40 times its value in 2004. In addition, the area with a subsidence of 3.0 m reached about 0.44 hm2 in 2009, up from zero. Finally, the fifth finding indicated that the overall extent of the mining subsidence was much more serious on the southern than on the northern side of the Songzao Mine. Moreover, it was indicated that the increasing rate of mining subsidence in the western side of the study area was as bigger as in the eastern side between 1999 and 2009. The spatiotemporal variation rules of Songzao coal mining subsidence based on numerical simulation could provide a reference for subsequent subsidence prevention and land consolidation.

  15. Interval maps associated to the cellular automaton rule 184

    International Nuclear Information System (INIS)

    We associate to the cellular automaton elementary rule 184 an interval map defined in [0,1]. We show that this interval map is characterized by a functional equation which depends directly on the local rule and also depends on the choice to represent numbers in base 2. The functional equation is the analytical expression of the interval map self-similarity. We also compute a family of transition matrices which characterizes the effect of the interval map on a family of partitions of the interval [0,1]. We show how the family of matrices can be built with a recursive algorithm which depends on the local rule.
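
    The sketch below illustrates the two ingredients named in this abstract: the local rule 184 and the reading of a binary configuration as a number in [0,1] via its base-2 expansion. It is not the paper's construction; the finite word length and the periodic boundary are assumptions made so that the example is runnable.

```python
RULE = 184  # binary 10111000: output bit for each (left, center, right) neighborhood


def rule184_step(cells):
    """Apply elementary rule 184 once to a list of 0/1 cells.
    A periodic boundary is assumed here for illustration."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = left * 4 + center * 2 + right
        out.append((RULE >> index) & 1)
    return out


def to_unit_interval(cells):
    """Read the cells as the binary digits 0.b1 b2 b3 ... of a number in [0, 1)."""
    return sum(bit * 2.0 ** -(i + 1) for i, bit in enumerate(cells))


word = [1, 0, 0, 0]
image = rule184_step(word)
print(to_unit_interval(word), "->", to_unit_interval(image))
```

    Iterating `rule184_step` over all words of a fixed length tabulates a finite approximation of the induced map on dyadic points; rule 184 conserves the number of 1s, so each "particle" simply moves right when the next cell is empty.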

  16. Generating Rare Association Rules Using the Minimal Rare Itemsets Family

    OpenAIRE

    Szathmary, Laszlo; Valtchev, Petko; Napoli, Amedeo

    2010-01-01

    Rare association rules correspond to rare, or infrequent, itemsets, as opposed to frequent ones that are targeted by conventional pattern miners. Rare rules reflect regularities of local, rather than global, scope that can nevertheless provide valuable insights to an expert, especially in areas such as genetics and medical diagnosis where some specific deviations/illnesses occur only in a small number of cases. The work presented here is motivated by the long-standing open question of efficie...

  17. Analysis 320 coal mine accidents using structural equation modeling with unsafe conditions of the rules and regulations as exogenous variables.

    Science.gov (United States)

    Zhang, Yingyu; Shao, Wei; Zhang, Mengjia; Li, Hejun; Yin, Shijiu; Xu, Yingjun

    2016-07-01

    Mining has been historically considered as a naturally high-risk industry worldwide. Deaths caused by coal mine accidents are more than the sum of all other accidents in China. Statistics of 320 coal mine accidents in Shandong province show that all accidents contain indicators of "unsafe conditions of the rules and regulations" with a frequency of 1590, accounting for 74.3% of the total frequency of 2140. "Unsafe behaviors of the operator" is another important contributory factor, which mainly includes "operator error" and "venturing into dangerous places." A systems analysis approach was applied by using structural equation modeling (SEM) to examine the interactions between the contributory factors of coal mine accidents. The analysis of results leads to three conclusions. (i) "Unsafe conditions of the rules and regulations," affect the "unsafe behaviors of the operator," "unsafe conditions of the equipment," and "unsafe conditions of the environment." (ii) The three influencing factors of coal mine accidents (with the frequency of effect relation in descending order) are "lack of safety education and training," "rules and regulations of safety production responsibility," and "rules and regulations of supervision and inspection." (iii) The three influenced factors (with the frequency in descending order) of coal mine accidents are "venturing into dangerous places," "poor workplace environment," and "operator error." PMID:27085591

  18. Mining for associations between text and brain activation in a functional neuroimaging database

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup; Hansen, Lars Kai; Balslev, D.

    2004-01-01

    We describe a method for mining a neuroimaging database for associations between text and brain locations. The objective is to discover association rules between words indicative of cognitive function as described in abstracts of neuroscience papers and sets of reported stereotactic Talairach...... coordinates. We invoke a simple probabilistic framework in which kernel density estimates are used to model distributions of brain activation foci conditioned on words in a given abstract. The principal associations are found in the joint probability density between words and voxels. We show that the...

  19. NIA2: A fast indirect association mining algorithm

    Institute of Scientific and Technical Information of China (English)

    NI Min; XU Xiao-fei; DENG Sheng-chun; WEN Xiao-xian

    2005-01-01

    Indirect association is a high-level relationship between items and frequent item sets in data. There are many potential applications for indirect associations, such as database marketing, intelligent data analysis, web-log analysis and recommender systems. Existing indirect association mining algorithms are mostly based on post-processing the discovered frequent item sets: all frequent item sets must be generated first, and they are then filtered and joined to form indirect associations. We have previously presented an indirect association mining algorithm (NIA) based on the anti-monotonicity of indirect associations, in which k candidate indirect associations can be generated directly from k-1 candidate indirect associations without generating all frequent item sets. We also use the frequent itempair support matrix to reduce the time and memory space needed by the algorithm. In this paper, a novel algorithm (NIA2) is introduced, based on generating indirect association patterns between itempairs through one-item mediator sets from the frequent itempair support matrix. A notion of a mediator set support threshold is also presented. NIA2 mines indirect association patterns directly from the dataset, without generating all frequent item sets. The frequent itempair support matrix and the use of tm as the support threshold for mediator sets can significantly reduce the cost of join operations and of the search process compared with existing algorithms. Experiments on a real-world web log dataset show that NIA2 is one order of magnitude faster than existing algorithms.
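
    The idea of an itempair support matrix with one-item mediators can be sketched as follows. This is a simplified illustration, not NIA2 itself: it only checks support thresholds (no dependence measure between items and mediator), and the function names, the parameters `pair_max` and `mediator_min`, and the toy transactions are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Support of every unordered item pair, keyed by a sorted tuple."""
    counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            counts[(a, b)] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}


def indirect_pairs(transactions, pair_max, mediator_min):
    """Pairs (a, b) that rarely co-occur (support < pair_max) but that both
    co-occur with some single mediator item m (support >= mediator_min)."""
    sup = pair_support(transactions)

    def get(a, b):
        return sup.get((a, b) if a < b else (b, a), 0.0)

    items = sorted({i for t in transactions for i in t})
    found = []
    for a, b in combinations(items, 2):
        if get(a, b) >= pair_max:
            continue
        mediators = [m for m in items if m not in (a, b)
                     and get(a, m) >= mediator_min and get(b, m) >= mediator_min]
        if mediators:
            found.append((a, b, mediators))
    return found


# Toy data: "a" and "b" never appear together, but each appears with "m".
toy = [{"a", "m"}, {"a", "m"}, {"b", "m"}, {"b", "m"}]
print(indirect_pairs(toy, pair_max=0.25, mediator_min=0.5))
```

    Only pair counts are computed here, so no larger frequent item sets are ever enumerated, which mirrors the motivation stated in the abstract.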

  20. [Analysis on medication rules of state medical master Yan Zhenghua from prescriptions with citri reticulatae pericarpium based on data mining].

    Science.gov (United States)

    Wu, Jia-Rui; Guo, Wei-Xian; Zhang, Bing; Zhang, Xiao-Meng; Yang, Bing; Sheng, Xiao-Guang

    2014-02-01

    Prescriptions containing Citri Reticulatae Pericarpium formulated by Professor Yan were collected to build a database based on the traditional Chinese medicine (TCM) inheritance assist system. Data mining methods such as the apriori algorithm were then used to obtain the frequency of single medicines, the frequency of drug combinations, the association rules between drugs and the core drug combinations. The analysis of 1 027 prescriptions containing Citri Reticulatae Pericarpium showed that these prescriptions were commonly used to treat stomach aches, cough and other syndromes. The most frequent drug combinations were "Citri Reticulatae Pericarpium-Poria", "Paeoniae Radix Rubra-Citri Reticulatae Pericarpium" and so on. The drug association rules with a confidence of 1 were "Glycyrrhizae Radix ex Rhizoma --> Citri Reticulatae Pericarpium", "Paeoniae Alba Radix-Cyperi Rhizoma --> Citri Reticulatae Pericarpium", "Poria --> Citri Reticulatae Pericarpium", and so on. The drugs in these prescriptions mostly had the effects of regulating the flow of Qi and invigorating blood circulation, which reflects Professor Yan's clear thinking when composing prescriptions. PMID:25204133
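
    The support and confidence measures behind rules such as "Poria --> Citri Reticulatae Pericarpium" can be sketched as below. The four toy "prescriptions" and the item labels are invented for illustration and are not taken from the study; a rule with confidence 1 simply means that every transaction containing the antecedent also contains the consequent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Invented toy data, for illustration only.
toy = [
    frozenset({"poria", "citrus_peel"}),
    frozenset({"poria", "citrus_peel", "licorice"}),
    frozenset({"citrus_peel"}),
    frozenset({"licorice", "citrus_peel"}),
]
print(confidence(toy, {"poria"}, {"citrus_peel"}))
print(confidence(toy, {"citrus_peel"}, {"poria"}))
```

    Because every prescription in the study was selected for containing Citri Reticulatae Pericarpium, any rule with that item as its consequent has confidence 1 by construction, which is worth keeping in mind when reading the rules listed above.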

  1. ASSOCIATION RULES IN HORIZONTALLY DISTRIBUTED DATABASES WITH ENHANCED SECURE MINING

    OpenAIRE

    Sonal Patil; Harshad S. Patil

    2015-01-01

    Recent developments in information technology have made possible the collection and analysis of millions of transactions containing personal data. These data include shopping habits, criminal records, medical histories and credit records among others. In the term of distributed database, distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU controlled by a distributed database management system(together som...

  2. Design and implementation of data mining tools

    CERN Document Server

    Thuraisingham, Bhavani; Awad, Mamoun

    2009-01-01

    DATA MINING TECHNIQUES AND APPLICATIONS: Introduction; Trends; Data Mining Techniques and Applications; Data Mining for Cyber Security: Intrusion Detection; Data Mining for Web: Web Page Surfing Prediction; Data Mining for Multimedia: Image Classification; Organization of This Book; Next Steps; Data Mining Techniques; Introduction; Overview of Data Mining Tasks and Techniques; Artificial Neural Networks; Support Vector Machines; Markov Model; Association Rule Mining (ARM); Multiclass Problem; Image Mining; Summary; Data Mining Applications; Introduction; Intrusion Detection; Web Page Surfing Prediction; Image Classification; Summary; DATA MI

  3. Mining The Relationship Between Demographic Variables And Brand Associations

    OpenAIRE

    Dabbes, Ajayeb Abu; Kharbat, Faten

    2013-01-01

    This research aims to mine the relationship between demographic variables and brand associations, and study the relative importance of these variables. The study is conducted on fast-food restaurant brands chains in Jordan. The result ranks and evaluates the demographic variables in relation with the brand associations for the selected sample. Discovering brand associations according to demographic variables reveals many facts and linkages in the context of Jordanian culture. Suggestions are ...

  4. Re-mining item associations: methodology and a case study in apparel retailing

    OpenAIRE

    Demiriz, Ayhan; Ertek, Gürdal; Ertek, Gurdal; Atan, Tankut; Kula, Ufuk

    2011-01-01

    Association mining is the conventional data mining technique for analyzing market basket data and it reveals the positive and negative associations between items. While being an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negat...

  5. Compact Tree for Associative Classification of Data Stream Mining

    Directory of Open Access Journals (Sweden)

    K.Prasanna Lakshmi

    2012-03-01

    Full Text Available Data streams have recently emerged to address the problems of continuous data. Data stream mining is the process of extracting knowledge structures from continuous, rapid data records [1]. An important goal in data stream mining is the generation of a compact representation of the data, which helps reduce the time and space needed for further decision making. In this paper we propose a new scheme called Prefix Stream Tree (PST) for associative classification, which supports compact storage of data streams. The PSTree is generated in a single scan and efficiently discovers the exact set of patterns from data streams using a sliding window.

  6. CONTENT BASED MEDICAL IMAGE RETRIEVAL USING BINARY ASSOCIATION RULES

    OpenAIRE

    Akila; Uma Maheswari

    2013-01-01

    In this study, we propose a content-based medical image retrieval framework based on binary association rules to augment the results of medical image diagnosis, for supporting clinical decision making. Specifically, this work is employed on scanned Magnetic Resonance brain Images (MRI) and the proposed Content Based Image Retrieval (CBIR) process is for enhancing relevancy rate of retrieved images. The pertinent features of a query brain image are extracted by applying third order moment inva...

  7. New Framework in Sensitive Rule Hiding

    Directory of Open Access Journals (Sweden)

    A.S. NAVEENKUMAR

    2012-01-01

    Full Text Available Data mining is the process of extracting hidden patterns from data. As more data is gathered, with the amount of data doubling every three years, data mining is becoming an increasingly important tool for transforming this data into information. Privacy-preserving data mining is a novel research direction in data mining and statistical databases, which has recently been proposed in response to concerns about preserving personal or sensitive information derived from data mining algorithms. Two types of privacy have been proposed concerning data mining. The first type, called output privacy, is that the data is altered so that the mining result will preserve certain privacy. The second type, called input privacy, is that the data is manipulated so that the mining result is not affected or only minimally affected. For output privacy in hiding association rules, current approaches require the hidden rules or patterns to be given in advance. However, to specify the hidden rules, the entire data mining process needs to be executed. For some applications, only certain sensitive rules that contain sensitive items need to be hidden. In this work, an algorithm ISSRH (Increase Support Sensitive Rule Hiding) is proposed to hide the sensitive rules that contain sensitive items, so that sensitive rules containing specified sensitive items on the right-hand side of the rule cannot be inferred through association rule mining.

  8. Association rule analysis for the assessment of the risk of coronary heart events.

    Science.gov (United States)

    Karaolis, M; Moutiris, J A; Papaconstantinou, L; Pattichis, C S

    2009-01-01

    Although significant progress has been made in the diagnosis and treatment of coronary heart disease (CHD), further investigation is still needed. The objective of this study was to develop a data mining system using association analysis based on the apriori algorithm for the assessment of heart event related risk factors. The events investigated were: myocardial infarction (MI), percutaneous coronary intervention (PCI), and coronary artery bypass graft surgery (CABG). A total of 369 cases were collected from the Paphos CHD Survey, most of them with more than one event. The most important risk factors, as extracted from the association rule analysis were: sex (male), smoking, high density lipoprotein, glucose, family history, and history of hypertension. Most of these risk factors were also extracted by our group in a previous study using the C4.5 decision tree algorithms, and by other investigators. Further investigation with larger data sets is still needed to verify these findings. PMID:19965088

  9. Collaborative Data Mining Tool for Education

    Science.gov (United States)

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; Gea, Miguel; de Castro, Carlos

    2009-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the continuous improvement of e-learning courses allowing teachers with similar course's profile sharing and scoring the discovered information. This mining tool is oriented to be used by instructors non experts in data mining such that, its…

  10. Discovery of Web Topic-Specific Association Rules

    Institute of Scientific and Technical Information of China (English)

    杨沛; 郑启伦; 彭宏

    2003-01-01

    The topology of topic-specific websites contains rich hidden information for data mining. A new topic-specific association rule mining algorithm is proposed to further research in this area. The key idea is to analyze the frequent hyperlink relations between pages of different topics. Within a topic-specific area, if pages of one topic are frequently hyperlinked by pages of another topic, we consider the two topics relevant. Also, if pages of two different topics are frequently hyperlinked together by pages of a third topic, we consider the two topics relevant. Initial experiments show that this algorithm performs quite well in guiding a topic-specific crawling agent, and it can be applied to further discovery and mining on topic-specific websites.
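
    The first relevance criterion in this abstract (pages of one topic frequently linking to pages of another) can be sketched as a simple frequency count. This is an illustrative guess at the idea, not the algorithm from the paper; the function name, the normalization by outgoing cross-topic links, and the toy pages and topics are all assumptions.

```python
from collections import Counter


def topic_link_confidence(links, topic_of):
    """For each ordered topic pair (s, d) with s != d, return the share of
    cross-topic links leaving topic s that point to topic d.

    links: iterable of (source_page, target_page)
    topic_of: dict mapping each page to its topic label
    """
    pair_counts = Counter()
    out_counts = Counter()
    for src, dst in links:
        s, d = topic_of[src], topic_of[dst]
        if s == d:
            continue  # links within one topic carry no cross-topic evidence
        pair_counts[(s, d)] += 1
        out_counts[s] += 1
    return {pair: c / out_counts[pair[0]] for pair, c in pair_counts.items()}


# Hypothetical pages and topics.
topic_of = {"p1": "sports", "p2": "sports", "p3": "news", "p4": "travel"}
links = [("p1", "p3"), ("p2", "p3"), ("p1", "p4"), ("p3", "p1"), ("p1", "p2")]
print(topic_link_confidence(links, topic_of))
```

    A crawler could then treat topic pairs whose share exceeds some chosen threshold as relevant and prioritize links accordingly; the second criterion (two topics linked together from a third) would need a co-citation count and is not covered here.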

  11. Application of rule-based data mining techniques to real time ATLAS Grid job monitoring data

    Science.gov (United States)

    Ahrens, R.; Harenberg, T.; Kalinin, S.; Mättig, P.; Sandhoff, M.; dos Santos, T.; Volkmer, F.

    2012-12-01

    The Job Execution Monitor (JEM) is a job-centric grid job monitoring software developed at the University of Wuppertal and integrated into the pilot-based PanDA job brokerage system leveraging physics analysis and Monte Carlo event production for the ATLAS experiment on the Worldwide LHC Computing Grid (WLCG). With JEM, job progress and grid worker node health can be supervised in real time by users, site admins and shift personnel. Imminent error conditions can be detected early and countermeasures can be initiated by the job's owner immediately. Grid site admins can access aggregated data of all monitored jobs to infer the site status and to detect job and Grid worker node misbehavior. Shifters can use the same aggregated data to quickly react to site error conditions and broken production tasks. In this work, the application of novel data-centric rule-based methods and data-mining techniques to the real-time monitoring data is discussed. The usage of such automatic inference techniques on monitoring data to provide job and site health summary information to users and admins is presented. Finally, the provision of a secure real-time control and steering channel to the job as an extension of the presented monitoring software is considered, and a possible model of such a control method is presented.

  12. A guided search genetic algorithm using mined rules for optimal affective product design

    Science.gov (United States)

    Fung, Chris K. Y.; Kwong, C. K.; Chan, Kit Yan; Jiang, H.

    2014-08-01

    Affective design is an important aspect of new product development, especially for consumer products, to achieve a competitive edge in the marketplace. It can help companies to develop new products that can better satisfy the emotional needs of customers. However, product designers usually encounter difficulties in determining the optimal settings of the design attributes for affective design. In this article, a novel guided search genetic algorithm (GA) approach is proposed to determine the optimal design attribute settings for affective design. The optimization model formulated based on the proposed approach applied constraints and guided search operators, which were formulated based on mined rules, to guide the GA search and to achieve desirable solutions. A case study on the affective design of mobile phones was conducted to illustrate the proposed approach and validate its effectiveness. Validation tests were conducted, and the results show that the guided search GA approach outperforms the GA approach without the guided search strategy in terms of GA convergence and computational time. In addition, the guided search optimization model is capable of improving GA to generate good solutions for affective design.

  13. Application of rule-based data mining techniques to real time ATLAS Grid job monitoring data

    International Nuclear Information System (INIS)

    The Job Execution Monitor (JEM) is a job-centric grid job monitoring software developed at the University of Wuppertal and integrated into the pilot-based PanDA job brokerage system leveraging physics analysis and Monte Carlo event production for the ATLAS experiment on the Worldwide LHC Computing Grid (WLCG). With JEM, job progress and grid worker node health can be supervised in real time by users, site admins and shift personnel. Imminent error conditions can be detected early and countermeasures can be initiated by the job's owner immediately. Grid site admins can access aggregated data of all monitored jobs to infer the site status and to detect job and Grid worker node misbehavior. Shifters can use the same aggregated data to quickly react to site error conditions and broken production tasks. In this work, the application of novel data-centric rule-based methods and data-mining techniques to the real-time monitoring data is discussed. The usage of such automatic inference techniques on monitoring data to provide job and site health summary information to users and admins is presented. Finally, the provision of a secure real-time control and steering channel to the job as an extension of the presented monitoring software is considered, and a possible model of such a control method is presented.

  14. Application of rule-based data mining techniques to real time ATLAS Grid job monitoring data

    CERN Document Server

    Ahrens, R; The ATLAS collaboration; Kalinin, S; Maettig, P; Sandhoff, M; dos Santos, T; Volkmer, F

    2012-01-01

    The Job Execution Monitor (JEM) is a job-centric grid job monitoring software developed at the University of Wuppertal and integrated into the pilot-based “PanDA” job brokerage system leveraging physics analysis and Monte Carlo event production for the ATLAS experiment on the Worldwide LHC Computing Grid (WLCG). With JEM, job progress and grid worker node health can be supervised in real time by users, site admins and shift personnel. Imminent error conditions can be detected early and countermeasures can be initiated by the job’s owner immediately. Grid site admins can access aggregated data of all monitored jobs to infer the site status and to detect job and Grid worker node misbehaviour. Shifters can use the same aggregated data to quickly react to site error conditions and broken production tasks. In this work, the application of novel data-centric rule-based methods and data-mining techniques to the real-time monitoring data is discussed. The usage of such automatic inference techniques on monitorin...

  15. Study on the change rule of groundwater level and its impacts on vegetation at arid mining area

    Institute of Scientific and Technical Information of China (English)

    LEI Shao-gang; BIAN Zheng-fu; ZHANG Ri-chen; LI Lin

    2007-01-01

    The shallow groundwater in the Shendong mining area has been disturbed by large-scale underground mining. Taking the 32201 working face as the study area, the authors analyzed how groundwater level and aquifer thickness change under mining, using a large number of water-level observations. The impact of groundwater-level change on vegetation was then analyzed using the theory of groundwater-vegetation relationships in arid areas. The results show that the aquifer structure and the conditions of water supply, flow and drainage are changed by the water-proof mining. Within two years the groundwater level recovered only a little relative to the original level. However, the large change in groundwater level does not have a notable influence on the vegetation of this mining area, and further study indicates that groundwater-level change affects vegetation only under certain conditions. When evaluating the influence of groundwater-level change, the plant ecological water level, the warning water level and the spatial distribution of the original groundwater level and of the mining-induced groundwater-level change should be considered together.

  16. Respirable quartz hazard associated with coal mine roof bolter dust

    Energy Technology Data Exchange (ETDEWEB)

    Joy, G.J.; Beck, T.W.; Listak, J.M. [National Inst. for Occupational Safety and Health, Pittsburgh, PQ (United States)

    2010-07-01

    Pneumoconiosis has been reported to be increasing among underground coal miners in the Southern Appalachian Region. The National Institute for Occupational Safety and Health conducted a study to examine the particle size distribution and quartz content of dust generated by the installation of roof bolts in mines. Forty-six bulk samples of roof bolting machine pre-cleaner cyclone dump dust and collector box dust were collected from 26 underground coal mines. Real-time and integrated airborne respirable dust concentrations were measured on 3 mining sections in 2 mines. The real-time airborne dust concentration profiles were examined to identify any concentration changes that might be associated with pre-cleaner cyclone dust discharge events. The study showed that bolter dust is a potential inhalation hazard due to the fraction of dust less than 10 μm in size, and the quartz content of the dust. The pre-cleaner cyclone dust was significantly larger than the collector box dust, indicating that the pre-cleaner functioned properly in removing the larger dust size fraction from the airstream. However, the pre-cleaner dust still contained a substantial amount of respirable dust. It was concluded that in order to maintain the effectiveness of a roof bolter dust collector, periodic removal of dust is required. Appropriate work procedures and equipment are necessary to minimize exposure during this cleaning task. 13 refs., 3 tabs., 2 figs.

  17. Predicting the risk associated to pregnancy using data mining

    OpenAIRE

    Machado, José Manuel; Abelha, António; Santos, Manuel; Portela, Filipe; Pereira, Eliana; Brandão, Andreia

    2015-01-01

    Women willing to terminate pregnancy should in general use a specialized health unit, as is the case of Maternidade Júlio Dinis in Porto, Portugal. One of the four stages comprising the process is evaluation. The purpose of this article is to evaluate the process of Voluntary Termination of Pregnancy and, consequently, identify the risk associated with the patients. Data Mining (DM) models were induced to predict the risk in a real environment. Three different techniques were considered: Dec...

  18. Data mining theories, algorithms, and examples

    CERN Document Server

    Ye, Nong

    2013-01-01

    AN OVERVIEW OF DATA MINING METHODOLOGIES: Introduction to data mining methodologies. METHODOLOGIES FOR MINING CLASSIFICATION AND PREDICTION PATTERNS: Regression models; Bayes classifiers; Decision trees; Multi-layer feedforward artificial neural networks; Support vector machines; Supervised clustering. METHODOLOGIES FOR MINING CLUSTERING AND ASSOCIATION PATTERNS: Hierarchical clustering; Partitional clustering; Self-organized map; Probability distribution estimation; Association rules; Bayesian networks. METHODOLOGIES FOR MINING DATA REDUCTION PATTERNS: Principal components analysis; Multi-dimensional scaling; Latent variable anal

  19. Finding Influential Users in Social Media Using Association Rule Learning

    Directory of Open Access Journals (Sweden)

    Fredrik Erlandsson

    2016-04-01

    Full Text Available Influential users play an important role in online social networks since users tend to have an impact on one another. Therefore, the proposed work analyzes users and their behavior in order to identify influential users and predict user participation. Normally, the success of a social media site is dependent on the activity level of the participating users. For both online social networking sites and individual users, it is of interest to find out if a topic will be interesting or not. In this article, we propose association learning to detect relationships between users. In order to verify the findings, several experiments were executed based on social network analysis, in which the most influential users identified from association rule learning were compared to the results from Degree Centrality and Page Rank Centrality. The results clearly indicate that it is possible to identify the most influential users using association rule learning. In addition, the results also indicate a lower execution time compared to state-of-the-art methods.

  20. Finding Influential Users in Social Media Using Association Rule Learning

    Science.gov (United States)

    Erlandsson, Fredrik; Bródka, Piotr; Borg, Anton; Johnson, Henric

    2016-04-01

    Influential users play an important role in online social networks since users tend to have an impact on one another. Therefore, the proposed work analyzes users and their behavior in order to identify influential users and predict user participation. Normally, the success of a social media site is dependent on the activity level of the participating users. For both online social networking sites and individual users, it is of interest to find out if a topic will be interesting or not. In this article, we propose association learning to detect relationships between users. In order to verify the findings, several experiments were executed based on social network analysis, in which the most influential users identified from association rule learning were compared to the results from Degree Centrality and Page Rank Centrality. The results clearly indicate that it is possible to identify the most influential users using association rule learning. In addition, the results also indicate a lower execution time compared to state-of-the-art methods.

  1. Evaluation of Association Rules Extracted during Anomaly Explanation

    Czech Academy of Sciences Publication Activity Database

    Kopp, Martin; Holeňa, Martin

    Aachen & Charleston: Technical University & CreateSpace Independent Publishing Platform, 2015 - (Yaghob, J.), s. 143-149. (CEUR Workshop Proceedings. V-1422). ISBN 978-1-5151-2065-0. ISSN 1613-0073. [ITAT 2015. Conference on Theory and Practice of Information Technologies /15./. Slovenský Raj (SK), 17.09.2015-21.09.2015] R&D Projects: GA ČR GA13-17187S Institutional support: RVO:67985807 Keywords : anomaly detection * anomaly interpretation * association rules * confidence boost * random forest Subject RIV: IN - Informatics, Computer Science

  2. Advances in rule-based process mining: applications for enterprise risk management and auditing.

    OpenAIRE

    Caron, Filip; Vanthienen, Jan; Baesens, Bart

    2013-01-01

    Process mining research has mainly focused on the development of process mining techniques, with process discovery algorithms in the center of attention. However, far less research attention has been paid to the actual applicability of these process mining techniques in common business settings. Consequently, there only exists a partial fit between the existing process mining techniques and the compliance checking & risk management applications. This research report contributes to the process...

  3. NV - Assessment of wildlife hazards associated with mine pit lakes

    Data.gov (United States)

    US Fish and Wildlife Service, Department of the Interior — Several open pit mines in Nevada lower groundwater to mine ore below the water table. After mining, the pits partially fill with groundwater to form pit lakes....

  4. Population cancer risks associated with coal mining: a systematic review.

    Directory of Open Access Journals (Sweden)

    Wiley D Jenkins

    Full Text Available BACKGROUND: Coal is produced across 25 states and provides 42% of US energy. With production expected to increase 7.6% by 2035, proximate populations remain at risk of exposure to carcinogenic coal products such as silica dust and organic compounds. It is unclear if population exposure is associated with increased risk, or even which cancers have been studied in this regard. METHODS: We performed a systematic review of English-language manuscripts published since 1980 to determine if coal mining exposure was associated with increased cancer risk (incidence and mortality). RESULTS: Of 34 studies identified, 27 studied coal mining as an occupational exposure (coal miner cohort or as a retrospective risk factor) but only seven explored health effects in surrounding populations. Overall, risk assessments were reported for 20 cancer site categories, but their results and frequency varied considerably. Incidence and mortality risk assessments were: negative (no increase) for 12 sites; positive for 1 site; and discordant for 7 sites (e.g. lung, gastric). However, 10 sites had only a single study reporting incidence risk (4 sites had none), and 11 sites had only a single study reporting mortality risk (2 sites had none). The ecological study data were particularly meager, reporting assessments for only 9 sites. While mortality assessments were reported for each, 6 had only a single report and only 2 sites had reported incidence assessments. CONCLUSIONS: The reported assessments are too meager, and at times contradictory, to make definitive conclusions about population cancer risk due to coal mining. However, the preponderance of this and other data support many of Hill's criteria for causation. The paucity of data regarding population exposure and risk, the widespread geographical extent of coal mining activity, and the continuing importance of coal for US energy, warrant further studies of population exposure and risk.

  5. Basketball Domain Rule Mining Application Research Based on Behaviour Analysis

    Institute of Scientific and Technical Information of China (English)

    马萌; 于重重; 陈钧; 郭雪

    2014-01-01

    Association rule mining research tends to emphasize algorithms and neglect business needs, so mining results often fail to meet business objectives. To address this, a basketball domain rule mining model based on behavior analysis is proposed. The model uses the concept of behavior analysis to build the database to be mined, takes domain knowledge as the reference throughout the mining process, and combines technical interestingness with business interestingness to discover deep rules that meet the rule evaluation criteria. Experiments and analysis were carried out on game data from the 2012 season of the China Basketball Association (CBA) league. The results demonstrate the effectiveness and practicality of the basketball domain rule mining model based on behavior analysis.

  6. An artificial immune system for fuzzy-rule induction in data mining

    OpenAIRE

    Alves, Roberto T.; Delgado, Myriam; Lopes, Heitor S.; Freitas, Alex. A.

    2004-01-01

    This work proposes a classification-rule discovery algorithm integrating artificial immune systems and fuzzy systems. The algorithm consists of two parts: a sequential covering procedure and a rule evolution procedure. Each antibody (candidate solution) corresponds to a classification rule. The classification of new examples (antigens) considers not only the fitness of a fuzzy rule based on the entire training set, but also the affinity between the rule and the new example. This affinity must...

  7. A new method for the discovery of the best threshold value for finding positive or negative association rules using Binary Particle Swarm Optimization

    Directory of Open Access Journals (Sweden)

    Abdoljabbar Asadi

    2012-11-01

    Full Text Available In association rule mining, most former research has worked on optimizing the analysis method, but finding and specifying the support threshold still influences the quality of association rule mining and remains important. This research therefore presents a new algorithm that improves mining efficiency and automatically determines a proper threshold value. Former methods performed this task on positive rules only, since finding negative rules was tough for the administrator; the advantage of this research is that the threshold level is determined automatically for the first time, and the method remains highly efficient on large databases. Particle Swarm Optimization evaluates the fitness of each particle, and with the data converted to binary form the support value is found. Results showed that Particle Swarm Optimization could find a better threshold level and considerably improve on the former algorithm's results. The results are compared with Weka and Apriori.

  8. Mining multi-item drug adverse effect associations in spontaneous reporting systems

    Directory of Open Access Journals (Sweden)

    Chase Herbert S

    2010-10-01

    Full Text Available Abstract Background: Multi-item adverse drug event (ADE) associations are associations relating multiple drugs to possibly multiple adverse events. The current standard in pharmacovigilance is bivariate association analysis, where each single drug-adverse effect combination is studied separately. The importance and difficulty in the detection of multi-item ADE associations was noted in several prominent pharmacovigilance studies. In this paper we examine the application of a well established data mining method known as association rule mining, which we tailored to the above problem, and demonstrate its value. The method was applied to the FDA's spontaneous adverse event reporting system (AERS) with minimal restrictions and expectations on its output, an experiment that has not been previously done on the scale and generality proposed in this work. Results: Based on a set of 162,744 reports of suspected ADEs reported to AERS and published in the year 2008, our method identified 1167 multi-item ADE associations. A taxonomy that characterizes the associations was developed based on a representative sample. A significant number (67% of the total) of potential multi-item ADE associations identified were characterized and clinically validated by a domain expert as previously recognized ADE associations. Several potentially novel ADEs were also identified. A smaller proportion (4%) of associations were characterized and validated as known drug-drug interactions. Conclusions: Our findings demonstrate that multi-item ADEs are present and can be extracted from the FDA’s adverse effect reporting system using our methodology, suggesting that our method is a valid approach for the initial identification of multi-item ADEs. The study also revealed several limitations and challenges that can be attributed to both the method and quality of data.
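
    The contrast drawn above between bivariate analysis and multi-item associations can be illustrated with the usual rule measures. The following is only a minimal sketch, not the method tailored in the paper: the function name `rule_metrics`, the representation of each report as a Python set, and the toy reports are all hypothetical. It computes support, confidence and lift for a rule whose antecedent contains several drugs.

```python
def rule_metrics(reports, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    `reports` is a list of sets; each set holds the drugs and events
    mentioned in one (hypothetical) report.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(reports)
    n_a = sum(antecedent <= r for r in reports)                    # reports containing all drugs
    n_c = sum(consequent <= r for r in reports)                    # reports containing all events
    n_both = sum((antecedent | consequent) <= r for r in reports)  # reports containing both sides
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Toy data, invented purely for illustration.
reports = [
    {"drugA", "drugB", "rash"},
    {"drugA", "drugB", "rash"},
    {"drugA", "nausea"},
    {"drugB"},
]
print(rule_metrics(reports, {"drugA", "drugB"}, {"rash"}))
```

    A bivariate analysis would score drugA and drugB against the event separately; here the two drugs are tested jointly as one antecedent.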

  9. The spatiotemporal variation rules of Songzao coal mining subsidence based on numerical simulation

    OpenAIRE

    Lu, J.; Li, Y.; Cheng, H.; Tang, Z.(University of Science & Technology of China, Hefei, 230026, China)

    2015-01-01

    With the increasing demand for coal, coal mining at Songzao is enlarging the area of land subsidence. Land subsidence in the coal mining area has not only taken large areas of subsided farmland out of production and caused enormous losses to local agricultural production, but has also brought a number of serious problems to the local social economy and ecological environment. To use Probability-integral Method based on numerical simulation of Songzao Mine, its subsidence simulation data fro...

  10. An Efficient Association Rules Algorithm Based on Compressed Matrix

    Directory of Open Access Journals (Sweden)

    Zhiyong Wang

    2013-10-01

    Full Text Available This paper analyses the classic Apriori algorithm and some disadvantages of its improved variants, and on that basis improves the Boolean matrix representation. A row and a column are added to the original Boolean matrix to store the weight of each row vector and the count of each column vector. Using the properties of the Apriori algorithm, the Boolean matrix is heavily compressed, which greatly reduces space complexity. The method of weighted vector inner products is then used to find frequent k-itemsets and derive the association rules. The improved algorithm substantially reduces both space and time complexity. Finally, the paper gives the computing procedure of the improved algorithm and shows by experiments that it is effective.
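
    The idea of counting support through Boolean vectors can be sketched as follows. This is a simplified illustration and not the paper's algorithm: it omits the matrix compression and the weight row and column, and the function name `frequent_itemsets` and the toy transactions are assumptions. Each item is a Boolean column, and the support of an itemset is the number of rows in which all of its columns are true.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support count} using Boolean item columns."""
    items = sorted({item for t in transactions for item in t})
    # One Boolean column per item: columns[item][row] is True if that
    # transaction contains the item.
    columns = {item: [item in t for t in transactions] for item in items}

    def support(itemset):
        # Inner product of Boolean columns: rows where every column is True.
        return sum(all(columns[i][row] for i in itemset)
                   for row in range(len(transactions)))

    frequent = {}
    current = [(i,) for i in items]
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Apriori step: a (k+1)-candidate is kept only if all of its
        # k-subsets were frequent at this level.
        survivors = sorted({i for c in level for i in c})
        k = len(current[0]) + 1
        current = [c for c in combinations(survivors, k)
                   if all(sub in level for sub in combinations(c, k - 1))]
    return frequent


# Toy transactions, invented for illustration.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = frequent_itemsets(transactions, min_support=3)
print(result)
```

    With a minimum support of 3, every single item and every pair is frequent in this toy data, while the triple appears in only two transactions and is pruned.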

  11. An improved predictive association rule based classifier using gain ratio and T-test for health care data diagnosis

    Indian Academy of Sciences (India)

    M Nandhini; S N Sivanandam

    2015-09-01

    Health care data diagnosis is a significant task that needs to be executed precisely, which requires much experience and domain knowledge. Traditional symptoms-based disease diagnosis may lead to false presumptions. In recent times, Associative Classification (AC), the combination of association rule mining and classification, has received attention in health care applications, which demand maximum accuracy. Though several AC techniques exist, they fall short in generating quality rules for building an efficient associative classifier. This paper aims to enhance the accuracy of the existing CPAR (Classification based on Predictive Association Rule) algorithm by generating quality rules using Gain Ratio. Mostly, health care applications deal with high dimensional datasets. Existence of high dimensions causes unfair estimates in disease diagnosis. Dimensionality reduction is commonly applied as a preprocessing step before the classification task to improve classifier accuracy. It eliminates redundant and insignificant dimensions by keeping good ones without information loss. In this work, dimensionality reductions by T-test and reduct sets (or simply reducts) are performed as a preprocessing step before the CPAR and CPAR using Gain Ratio (CPAR-GR) algorithms. An investigation was also performed to determine the impact of T-test and reducts on CPAR and CPAR-GR. This paper synthesizes the existing work carried out in AC, and also discusses the factors that influence the performance of CPAR and CPAR-GR. Experiments were conducted using six health care datasets from the UCI machine learning repository. Based on the experiments, CPAR-GR with T-test yields better classification accuracy than CPAR.
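
    For readers unfamiliar with the Gain Ratio measure mentioned above, a minimal sketch of the standard definition (information gain divided by split information, as used in C4.5-style learners) is given below. How CPAR-GR applies the measure during rule generation is not described in the abstract, so the helper names `entropy` and `gain_ratio` and the toy data are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy attribute and class labels, invented for illustration.
symptom = ["yes", "yes", "no", "no"]
disease = [1, 1, 0, 0]
print(gain_ratio(symptom, disease))
```

    An attribute that takes a single value for every record has zero split information; the sketch returns 0.0 in that case instead of dividing by zero.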

  12. Generation of Acid Mine Lakes Associated with Abandoned Coal Mines in Northwest Turkey.

    Science.gov (United States)

    Sanliyuksel Yucel, Deniz; Balci, Nurgul; Baba, Alper

    2016-05-01

    A total of five acid mine lakes (AMLs) located in northwest Turkey were investigated using combined isotope, molecular, and geochemical techniques to identify geochemical processes controlling and promoting acid formation. All of the investigated lakes showed typical characteristics of an AML with low pH (2.59-3.79) and high electrical conductivity values (1040-6430 μS/cm), in addition to high sulfate (594-5370 mg/l) and metal (aluminum [Al], iron [Fe], manganese [Mn], nickel [Ni], and zinc [Zn]) concentrations. Geochemical and isotope results showed that the acid-generation mechanism and source of sulfate in the lakes can change and depends on the age of the lakes. In the relatively older lakes (AMLs 1 through 3), biogeochemical Fe cycles seem to be the dominant process controlling metal concentration and pH of the water unlike in the younger lakes (AMLs 4 and 5). Bacterial species determined in an older lake (AML 2) indicate that biological oxidation and reduction of Fe and S are the dominant processes in the lakes. Furthermore, O and S isotopes of sulfate indicate that sulfate in the older mine lakes may be a product of much more complex oxidation/dissolution reactions. However, the major source of sulfate in the younger mine lakes is in situ pyrite oxidation catalyzed by Fe(III) produced by way of oxidation of Fe(II). Consistent with this, insignificant fractionation between δ(34) [Formula: see text] and δ(34) [Formula: see text] values indicated that the oxidation of pyrite, along with dissolution and precipitation reactions of Fe(III) minerals, is the main reason for acid formation in the region. Overall, the results showed that acid generation during early stage formation of an AML associated with pyrite-rich mine waste is primarily controlled by the oxidation of pyrite with Fe cycles becoming the dominant processes regulating pH and metal cycles in the later stages of mine lake development. PMID:26987541

  13. Associations between rule-based parenting practices and child screen viewing: A cross-sectional study

    Directory of Open Access Journals (Sweden)

    Joanna M. Kesten

    2015-01-01

    Conclusions: Limit setting is associated with greater SV. Collaborative rule setting may be effective for managing boys' game-console use. More research is needed to understand rule-based parenting practices.

  14. 3D reconstruction method and connectivity rules of fracture networks generated under different mining layouts

    Institute of Scientific and Technical Information of China (English)

    Zhang Ru; Ai Ting; Li Hegui; Zhang Zetian; Liu Jianfeng

    2013-01-01

    In the current research, a series of triaxial tests, which were employed to simulate three typical mining layouts (i.e., top-coal caving, non-pillar mining and protected coal seam mining), were conducted on coal using the MTS815 Flex Test GT rock mechanics test system, and the fracture networks in the broken coal samples were qualitatively and quantitatively investigated by employing CT scanning and 3D reconstruction techniques. This work aimed at providing a detailed description of the micro-structure and fracture-connectivity characteristics of ruptured coal samples under different mining layouts. The results show that: (i) for the protected coal seam mining layout, the coal specimens fail in a compression-shear manner, whereas (ii) tension-shear failure is observed for the top-coal caving and non-pillar mining layouts. By investigating the connectivity features of the generated fractures in the direction of r1 under different mining layouts, it is found that the connectivity level of the fractures of the samples corresponding to the non-pillar mining layout was the highest.

  15. Data mining algorithm for discovering matrix association regions (MARs)

    Science.gov (United States)

    Singh, Gautam B.; Krawetz, Shephan A.

    2000-04-01

    Lately, there has been considerable interest in applying Data Mining techniques to scientific and data analysis problems in bioinformatics. Data mining research is being fueled by novel application areas that are helping the development of newer applied algorithms in the field of bioinformatics, an emerging discipline representing the integration of biological and information sciences. This is a shift in paradigm from the earlier and the continuing data mining efforts in marketing research and support for business intelligence. The problem described in this paper is along a new dimension in DNA sequence analysis research and supplements the previously studied stochastic models for evolution and variability. The discovery of novel patterns from genetic databases as described is quite significant because biological patterns play an important role in a large variety of cellular processes and constitute the basis for gene therapy. Biological databases containing the genetic codes from a wide variety of organisms, including humans, have continued their exponential growth over the last decade. At the time of this writing, the GenBank database contains over 300 million sequences and over 2.5 billion characters of sequenced nucleotides. The focus of this paper is on developing a general data mining algorithm for discovering regions of locus control, i.e. those regions that are instrumental for determining cell type. One such type of element of locus control are the MARs or the Matrix Association Regions. Our limited knowledge about MARs has hampered their detection using classical pattern recognition techniques. Consequently, their detection is formulated by utilizing a statistical interestingness measure derived from a set of empirical features that are known to be associated with MARs. 
    This paper presents a systematic approach for finding associations between such empirical features in genomic sequences, and for utilizing this knowledge to detect biologically interesting

  16. Feed Forward Neural Network Algorithm for Frequent Patterns Mining

    OpenAIRE

    Dr. K.R.Pardasani; Sanjay Sharma; Amit Bhagat

    2010-01-01

    Association rule mining is used to find relationships among items in large data sets. Frequent pattern mining is an important aspect of association rule mining. In this paper, an efficient algorithm named Apriori-Feed Forward (AFF), based on the Apriori algorithm and the Feed Forward Neural Network, is presented to mine frequent patterns. The Apriori algorithm scans the database many times to generate frequent itemsets, whereas the Apriori-Feed Forward (AFF) algorithm scans the database only once. Computational resu...

  17. Quantum Privacy-Preserving Data Mining

    OpenAIRE

    Ying, Shenggang; Ying, Mingsheng; Feng, Yuan

    2015-01-01

    Data mining is a key technology in big data analytics and it can discover understandable knowledge (patterns) hidden in large data sets. Association rules are among the most useful knowledge patterns, and a large number of algorithms have been developed in the data mining literature to generate association rules corresponding to different problems and situations. Privacy becomes a vital issue when data mining is applied to sensitive data sets like medical records, commercial data sets and nationa...

  18. The nature of waste associated with closed mines in England and Wales

    OpenAIRE

    B. Palumbo-Roe; Colman, T.

    2010-01-01

    This report has been prepared for the Environment Agency (EA) to provide information on mineral waste associated with closed mining and quarrying sites in England and Wales as part of the provisions of the EU Mine Waste Directive 2006 (MWD). The Environment Agency is the regulatory body for England and Wales (E&W) responsible for producing an inventory of closed mining waste facilities, including abandoned waste facilities, as required by Article 20 of the European Mine Wastes Directive, by M...

  19. Preliminary study on regulatory limits of coal mines associated with radionuclides in Xinjiang

    International Nuclear Information System (INIS)

    In this paper, limits on radon concentration and gamma radiation dose rate for coal mines associated with radionuclides in Xinjiang were studied to provide a theoretical basis for developing scientific and practical regulatory standards of radiation protection for such mines. This is meaningful for strengthening the supervision of coal mines associated with radionuclides, promoting their exploitation and utilization, and protecting the health of workers and the public as well as the environment. It may also provide a reference for defining regulatory limits for the mining and processing of ore resources associated with NORM. (authors)

  20. Associations between rule-based parenting practices and child screen viewing: A cross-sectional study

    Science.gov (United States)

    Kesten, Joanna M.; Sebire, Simon J.; Turner, Katrina M.; Stewart-Brown, Sarah; Bentley, Georgina; Jago, Russell

    2015-01-01

    Background Child screen viewing (SV) is positively associated with poor health indicators. Interventions addressing rule-based parenting practices may offer an effective means of limiting SV. This study examined associations between rule-based parenting practices (limit and collaborative rule setting) and SV in 6–8-year old children. Methods An online survey of 735 mothers in 2011 assessed: time that children spent engaged in SV activities; and the use of limit and collaborative rule setting. Logistic regression was used to examine the extent to which limit and collaborative rule setting were associated with SV behaviours. Results ‘Always’ setting limits was associated with more TV viewing, computer, smartphone and game-console use and a positive association was found between ‘always’ setting limits for game-console use and multi-SV (in girls). Associations were stronger in mothers of girls compared to mothers of boys. ‘Sometimes’ setting limits was associated with more TV viewing. There was no association between ‘sometimes’ setting limits and computer, game-console or smartphone use. There was a negative association between collaborative rule setting and game-console use in boys. Conclusions Limit setting is associated with greater SV. Collaborative rule setting may be effective for managing boys' game-console use. More research is needed to understand rule-based parenting practices. PMID:26844054

  1. EFFICIENT DISPATCHING RULES BASED ON DATA MINING FOR THE SINGLE MACHINE SCHEDULING PROBLEM

    Directory of Open Access Journals (Sweden)

    Mohamed Habib Zahmani

    2015-11-01

    Full Text Available In manufacturing the solutions found for scheduling problems and the human expert’s experience are very important. They can be transformed using Artificial Intelligence techniques into knowledge and this knowledge could be used to solve new scheduling problems. In this paper we use Decision Trees for the generation of new Dispatching Rules for a Single Machine shop solved using a Genetic Algorithm. Two heuristics are proposed to use the new Dispatching Rules and a comparative study with other Dispatching Rules from the literature is presented.

  2. Mining Rules from Crisp Attributes by Rough Sets on the Fuzzy Class Sets

    OpenAIRE

    Mojtaba MadadyarAdeh; Dariush Dashchi Rezaee; Ali Soultanmohammadi

    2012-01-01

    Machine learning can extract desired knowledge and ease the development bottleneck in building expert systems. Among the proposed approaches, deriving classification rules from training examples is the most common. Given a set of examples, a learning program tries to induce rules that describe each class. The rough-set theory has served as a good mathematical tool for dealing with data classification problems. In the past, the rough-set theory was widely used in dealing with data classificati...

  3. EFFICIENT DISPATCHING RULES BASED ON DATA MINING FOR THE SINGLE MACHINE SCHEDULING PROBLEM

    OpenAIRE

    Mohamed Habib Zahmani; Baghdad Atmani; Abdelghani Bekrar

    2015-01-01

    In manufacturing the solutions found for scheduling problems and the human expert’s experience are very important. They can be transformed using Artificial Intelligence techniques into knowledge and this knowledge could be used to solve new scheduling problems. In this paper we use Decision Trees for the generation of new Dispatching Rules for a Single Machine shop solved using a Genetic Algorithm. Two heuristics are proposed to use the new Dispatching Rules and a comparative s...

  4. Modified Entropy Measure for Detection of Association Rules Under Simpson's Paradox Context

    OpenAIRE

    Choy, Murphy; Ong, Cally Claire; Cheong, Michelle

    2012-01-01

    The rapid explosion in retail data calls for more effective and efficient discovery of association rules to develop relevant business strategies and rules. Unlike online shopping sites, most brick and mortar retail shops are located in geographically and demographically diverse areas. This diversity presents a new challenge to the classical association rule model which assumes a homogenous group of customers behaving differently. The focus of this paper is centered on the discovery of associat...

  5. Data Mining Rules for Ultrasonic B-Type Detection and Diagnosis for Cholecystolithiasis

    Institute of Scientific and Technical Information of China (English)

    LOU Wei; YAN Li-min; HE Guo-sen

    2004-01-01

    This paper presents realistic data mining based on the data of B-type ultrasonic detection and diagnosis for cholecystolithiasis (gallbladder stone in biliary tract) recorded by a district central hospital in Shanghai during the past several years. Computer simulation and modeling is described.

  6. The study of slip line field and upper bound method based on associated flow and non-associated flow rules

    Institute of Scientific and Technical Information of China (English)

    Zheng Yingren; Deng Chujian; Wang Jinglin

    2010-01-01

    At present, the associated flow rule of traditional plastic theory is adopted in the slip line field theory and upper bound method of geotechnical materials, so the stress characteristic line conforms to the velocity line. It is proved that geotechnical materials do not abide by the associated flow rule, and it is impossible for the stress characteristic line to conform to the velocity line. Generalized plastic mechanics theoretically proved that the plastic potential surface intersects the Mohr-Coulomb yield surface at an angle, so that the velocity line must be studied by the non-associated flow rule. According to limit analysis theory, the theory of slip line field is put forward in this paper, and then the ultimate bearing capacity of a strip footing is obtained based on the associated flow rule and the non-associated flow rule individually. These two results are identical since the ultimate bearing capacity is independent of the flow rule. On the contrary, the velocity fields of associated and non-associated flow rules are different, which shows that the velocity field based on the associated flow rule is incorrect.

  7. Mine subsidence control projects associated with solid waste disposal facilities

    International Nuclear Information System (INIS)

    Pennsylvania environmental regulations require applicants for solid waste disposal permits to provide information regarding the extent of deep mining under the proposed site, evaluations of the maximum subsidence potential, and designs of measures to mitigate potential subsidence impact on the facility. This paper presents three case histories of deep mine subsidence control projects at solid waste disposal facilities. Each case history presents site specific mine grouting project data summaries which include evaluations of the subsurface conditions from drilling, mine void volume calculations, grout mix designs, grouting procedures and techniques, as well as grout coverage and extent of mine void filling evaluations. The case studies described utilized basic gravity grouting techniques to fill the mine voids and fractured strata over the collapsed portions of the deep mines. Grout mixtures were designed to achieve compressive strengths suitable for preventing future mine subsidence while maintaining high flow characteristics to penetrate fractured strata. Verification drilling and coring was performed in the grouted areas to determine the extent of grout coverage and obtain samples of the in-place grout for compression testing. The case histories presented in this report demonstrate an efficient and cost effective technique for mine subsidence control projects.

  8. Design and Realization of user Behaviors Recommendation System Based on Association rules under Cloud Environment

    Directory of Open Access Journals (Sweden)

    Wei Dai

    2013-07-01

    Full Text Available This study introduces the basic principles of association rules and the properties and advantages of the MapReduce model and HBase in the Hadoop ecosystem. It then gives the design steps of the user behavior recommendation system in detail. Repeated experiments show that combining association rule theory with cloud computing is successful and effective.

  9. Using Association Rules for Better Treatment of Missing Values

    OpenAIRE

    Bashir, Shariq; Razzaq, Saad; Maqbool, Umer; Tahir, Sonya; Baig, Abdul Rauf

    2009-01-01

    The quality of training data for knowledge discovery in databases (KDD) and data mining depends upon many factors, but handling missing values is considered to be a crucial factor in overall data quality. Today, real-world datasets contain missing values due to human and operational error, hardware malfunction and many other factors. The quality of the knowledge extracted, and of learning and decision making, depends directly upon the quality of training data. By considering the importance of handling m...

  10. An Efficient Method for Mining Event-Related Potential Patterns

    CERN Document Server

    Mousavi, Seyed Aliakbar; Mohamed, Hasimah Hj; Alomari, Saleh Ali

    2012-01-01

    In the present paper, we propose a Neuroelectromagnetic Ontology Framework (NOF) for mining Event-related Potential (ERP) patterns as well as the process. The aim of this research is to develop an infrastructure for mining, analysing and sharing the ERP domain ontologies. The outcome of this research is a Neuroelectromagnetic knowledge-based system. The framework has 5 stages: 1) Data pre-processing and preparation; 2) Data mining application; 3) Rule Comparison and Evaluation; 4) Association rules Post-processing; 5) Domain Ontologies. In the 5th stage, a new set of hidden rules can be discovered based on comparing association rules with domain ontologies and expert rules.

  11. An Efficient Method for Mining Event-Related Potential Patterns

    Directory of Open Access Journals (Sweden)

    Seyed Aliakbar Mousavi

    2011-11-01

    Full Text Available In the present paper, we propose a Neuroelectromagnetic Ontology Framework (NOF) for mining Event-related Potential (ERP) patterns as well as the process. The aim of this research is to develop an infrastructure for mining, analysing and sharing the ERP domain ontologies. The outcome of this research is a Neuroelectromagnetic knowledge-based system. The framework has 5 stages: 1) Data pre-processing and preparation; 2) Data mining application; 3) Rule Comparison and Evaluation; 4) Association rules Post-processing; 5) Domain Ontologies. In the 5th stage, a new set of hidden rules can be discovered based on comparing association rules with domain ontologies and expert rules.

  12. Fusion: a Visualization Framework for Interactive Ilp Rule Mining With Applications to Bioinformatics

    OpenAIRE

    Indukuri, Kiran Kumar

    2004-01-01

    Microarrays provide biologists an opportunity to find the expression profiles of thousands of genes simultaneously. Biologists try to understand the mechanisms underlying the life processes by finding out relationships between gene-expression and their functional categories. Fusion is a software system that aids the biologists in performing microarray data analysis by providing them with both visual data exploration and data mining capabilities. Its multiple view visual framework allows the u...

  13. Text Association Analysis and Ambiguity in Text Mining

    Science.gov (United States)

    Bhonde, S. B.; Paikrao, R. L.; Rahane, K. U.

    2010-11-01

    Text Mining (TM) is the process of analyzing a semantically rich document or set of documents to understand the content and meaning of the information they contain. Research in Text Mining will enhance humans' ability to process massive quantities of information, and it has high commercial value. Firstly, the paper introduces TM and its definition, and then gives an overview of the text mining process and its applications. Up to now, not much research in text mining, especially in concept/entity extraction, has focused on the ambiguity problem. This paper addresses ambiguity issues in natural language texts, and presents a new technique for resolving the ambiguity problem in extracting concepts/entities from texts. In the end, it shows the importance of TM in knowledge discovery and highlights the upcoming challenges of document mining and the opportunities it offers.

  14. Consideraciones generales del web mining

    OpenAIRE

    Castiblanco Calderón, Ellizabeth; Leal, Sandra Milena

    2007-01-01

    This article describes the basic concepts for the use of Web mining, including the techniques (association rules, sequential patterns, classification and clustering) and the areas (web content mining, web structure mining and web usage mining) most widely used for information discovery. It also aims to show the importance of using this technique within information units, bearing in mind that it works from the users' needs, which are essential to their operation.

  15. Mining

    Directory of Open Access Journals (Sweden)

    Khairullah Khan

    2014-09-01

    Full Text Available Opinion mining is an interesting area of research because of its applications in various fields. Collecting opinions of people about products and about social and political events and problems through the Web is becoming increasingly popular every day. The opinions of users are helpful for the public and for stakeholders when making certain decisions. Opinion mining is a way to retrieve information through search engines, Web blogs and social networks. Because of the huge number of reviews in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing the reviews from corpuses and Web documents. This study presents a systematic literature survey regarding the computational techniques, models and algorithms for mining opinion components from unstructured reviews.

  16. URANIUM MINING AND ASSOCIATED ENVIRONMENTAL PROBLEMS IN UKRAINE

    OpenAIRE

    Dudar, T.; Zakrytnyi, Ye.; Bugera, M.

    2015-01-01

    Nuclear power demand for uranium resources is expected to increase in the near future. So, the problem of the impact of uranium mining on the environment is a challenge and requires immediate action. The tendencies in uranium mining in the world and in Ukraine for the period of 2003-2013 are considered in this paper. The increase in demand for uranium raw material, and consequently in its mining, is especially noted. The available and potential uranium resources are overviewed. It shoul...

  17. Monitoring of radiation hygienic situation in the area of the Argun production mining and chemical association

    Directory of Open Access Journals (Sweden)

    Shandala N.K.

    2013-12-01

    Full Text Available The Argun Production Mining and Chemical Association is a multi-activity mining company which mines uranium ore and refines it in a hydrometallurgical process to produce natural uranium oxide. In order to establish the strategy and develop criteria for the site remediation, independent radiation hygienic monitoring has been carried out over a number of years. The research performed showed that there is a significant excess of 226Ra and 232Th content compared to areas outside the zone of influence of uranium mining.

  18. Catheter-associated Urinary Tract Infection and the Medicare Rule Changes

    OpenAIRE

    Saint, Sanjay; Meddings, Jennifer A.; Calfee, David; Kowalski, Christine P.; Krein, Sarah L.

    2009-01-01

    Catheter-associated urinary tract infection, a common and potentially preventable complication of hospitalization, is one of the hospital-acquired complications chosen by the Centers for Medicare and Medicaid Services (CMS) for which hospitals no longer receive additional payment. To help understand the potential consequences of the recent CMS rule changes, we examine the preventability of catheter-associated infection, review the CMS rule changes regarding catheter-associated urinary tract i...

  19. A Framework for Personal Mobile commerce Pattern Mining and Prediction

    OpenAIRE

    Monali Nayakr 1, Mr Kolla Bhanu Prakash

    2013-01-01

    Information plays a major role in any organization. We suggest a novel way of acquiring more information from corporate data mining without the complications and drawbacks of deploying additional software systems. Association-rule mining, which captures co-occurrence patterns within data, has attracted considerable efforts from data mining researchers and practitioners alike. Unfortunately, most data mining tools are loosely coupled, at best, with the data mining repository. Furthermore, thes...

  20. On Addressing Efficiency Concerns in Privacy Preserving Data Mining

    OpenAIRE

    Agrawal, Shipra; Krishnan, Vijay; Haritsa, Jayant

    2003-01-01

    Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To encourage users to provide correct inputs, we recently proposed a data distortion scheme for association rule mining that simultaneously provides both privacy to the user and accuracy in the mining results. However, mining the distorted database can be orders of magnitude more time-consuming as compared to mining the original databas...

  1. Disease Prediction in Data Mining Technique – A Survey

    OpenAIRE

    Sudha, S

    2013-01-01

    Data mining is defined as sifting through very large amounts of data for useful information. Some of the most important and popular data mining techniques are association rules, classification, clustering, prediction and sequential patterns. Data mining techniques are used for a variety of applications. In the health care industry, data mining plays an important role in predicting diseases. Detecting a disease may require a number of tests from the patient. But using data mining technique...

  2. RST Approach for Efficient CARs Mining

    Directory of Open Access Journals (Sweden)

    Thabet Slimani

    2014-11-01

    Full Text Available In data mining, an association rule is a pattern that states the occurrence of two items (premises and consequences) together with a certain probability. A class association rule set (CARs) is a subset of association rules with classes specified as their consequences. This paper focuses on class association rule mining based on the approach of Rough Set Theory (RST). In addition, this paper presents an algorithm for finest class rule set mining inspired by the Apriori algorithm, where the support and confidence are computed based on the elementary set of the lower approximation from RST. The proposed approach has been shown to be very effective, and the rough set approach for class association discovery is much simpler than the classic association method.
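
    As an illustration of the support and confidence measures mentioned in this abstract, the following is a minimal sketch using the classic definitions for a rule "premise -> class". It does not reproduce the paper's rough-set computation over lower approximations; the toy records, item names and the function name `class_rule_stats` are all assumptions made for the example.

```python
# Hypothetical toy data: each record is (set of attribute items, class label).
records = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "stay"),
    ({"outlook=rain", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=no"}, "play"),
]


def class_rule_stats(records, premise, label):
    """Return (support, confidence) of the class association rule premise -> label.

    support    = fraction of all records that contain the premise AND have the label
    confidence = fraction of records containing the premise that have the label
    """
    premise = frozenset(premise)
    n_premise = sum(1 for items, _ in records if premise <= items)
    n_both = sum(1 for items, cls in records if premise <= items and cls == label)
    support = n_both / len(records)
    confidence = n_both / n_premise if n_premise else 0.0
    return support, confidence


support, confidence = class_rule_stats(records, {"outlook=sunny"}, "play")
```

    A rule would typically be kept only when both values reach user-chosen minimum thresholds; those thresholds are not stated in the abstract.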

  3. Clinic-Genomic Association Mining for Colorectal Cancer Using Publicly Available Datasets

    OpenAIRE

    Fang Liu; Yaning Feng; Zhenye Li; Chao Pan; Yuncong Su; Rui Yang; Liying Song; Huilong Duan; Ning Deng

    2014-01-01

    In recent years, a growing number of researchers have begun to focus on how to establish associations between clinical and genomic data. However, up to now, there has been a lack of research mining clinic-genomic associations by comprehensively analysing available gene expression data for a single disease. Colorectal cancer is one of the malignant tumours. A number of genetic syndromes have been proven to be associated with colorectal cancer. This paper presents our research on mining clinic-genomic assoc...

  4. Fast Vertical Mining Using Boolean Algebra

    Directory of Open Access Journals (Sweden)

    Hosny M. Ibrahim

    2015-01-01

    Full Text Available The vertical association rule mining algorithm is an efficient mining method, which makes use of the support sets of frequent itemsets to calculate the support of candidate itemsets. It overcomes the disadvantage of scanning the database many times, as the Apriori algorithm does. In vertical mining, frequent itemsets can be represented as a set of bit vectors in memory, which enables fast computation. The sizes of the bit vectors for itemsets are the main space expense of the algorithm and restrict its scalability. Therefore, in this paper, an algorithm that compresses the bit vectors of frequent itemsets is proposed. The new bit vector schema presented here depends on Boolean algebra rules to compute the intersection of two compressed bit vectors without any costly decompression operation. The experimental results show that the proposed algorithm, the Vertical Boolean Mining (VBM) algorithm, is better than both the Apriori algorithm and the classical vertical association rule mining algorithm in mining time and memory usage.
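
    The bit-vector idea described in this abstract can be sketched as follows. This is only the plain, uncompressed vertical representation, with Python integers standing in for bit vectors; the compression schema of the VBM algorithm is not given in the abstract and is not reproduced here. The toy transactions and the names `build_bit_vectors` and `support_count` are assumptions for illustration.

```python
# Hypothetical toy database: one set of items per transaction.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def build_bit_vectors(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it.

    The database is scanned once; afterwards supports come from the vectors alone.
    """
    vectors = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << i)
    return vectors


def support_count(vectors, itemset):
    """Support count of an itemset: set bits in the AND of its items' vectors."""
    items = list(itemset)
    acc = vectors[items[0]]
    for item in items[1:]:
        acc &= vectors[item]  # intersection of the supporting transaction sets
    return bin(acc).count("1")


vectors = build_bit_vectors(transactions)
ab_support = support_count(vectors, {"a", "b"})
```

    Because each candidate's support is a bitwise AND of vectors already in memory, no further database scans are needed, which is the advantage over Apriori that the abstract describes.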

  5. E-commerce Website Recommender System Based on Dissimilarity and Association Rule

    OpenAIRE

    MingWang Zhang; ShuWen Yang; LiFeng Zhang

    2013-01-01

    Based on an analysis of current e-commerce recommendation algorithms, this paper puts forward a recommendation algorithm that uses dissimilarity clustering and association rules. The algorithm clusters the user data of shopping websites by dissimilarity and then applies an association rule algorithm to the clustering results to make recommendations. Experiments show that, compared with the traditional clustering association algorithm, the number of iterations decreases, improve...

  6. A study of trends in occupational risks associated with coal mining

    International Nuclear Information System (INIS)

    The coal industry is well known as a major source of specific types of risk and harmful effects including, for instance, harm to the environment, pollution from various surface installations and hazards associated with the actual task of mining. We shall confine our attention to the third group and discuss only the occupational risks facing miners and ex-miners. Unlike the nuclear and oil industries, coal-mines employ very large work-forces, and the risks associated with mining therefore have a considerable impact. Mining is also a highly integrated industry: a mine's own work-force carries out all the underground engineering work (preparatory excavations, installation work, etc.) as well as maintenance. In this narrow field, a distinction should immediately be drawn between two main areas: industrial accidents; and occupational diseases, which include silicosis or, more precisely, coal-miner's pneumoconiosis

  7. Determination of possible radiation hazards associated with tin mining industry in West Malaysia

    International Nuclear Information System (INIS)

    A study was made in Malaysia under an IAEA research contract on the possible radiation hazards associated with tin mining industry in Malaysia. The study comprised of the measurement of external radiation levels in various mines, gamma-ray spectrometric analysis of various samples from mines, and measurements of radon and radon daughters concentrations. For radon daughters modified Tsivoglou and Kusnetz methods were used. The study showed that there is, in general, no radiation hazard associated with the tin mining industry in West Malaysia. However, the only likely source that might pose some external radiation hazard is the amang upgrading plant which invariably concentrates either or both 232Th and 238U in the final products of the upgrading process. A quantitative and thorough investigation of radiation levels in the amang upgrading industry is necessary to determine the degree of hazard. No significant radon or radon daughters concentrations were noted in the underground mines

  8. The Development Research on Data Mining

    Institute of Scientific and Technical Information of China (English)

    张伟; 刘勇国; 彭军; 廖晓峰; 吴中福

    2001-01-01

    Mining knowledge from databases has been regarded as a key research issue in database systems. Great interest has been paid to data mining by researchers in different fields. In this paper, data mining techniques are introduced broadly, including their definition, purpose, characteristics, principal processes and classifications. As an example, studies on mining association rules are illustrated. Finally, some data mining prototypes are provided and several research trends in data mining are discussed.

  9. Application of Multidimensional Association Rules Method in Psychological Measurement

    Institute of Scientific and Technical Information of China (English)

    王冬燕

    2015-01-01

    Multidimensional association rule methods are used to extract association rules between the attributes of different psychological measurement scales; the sample includes 1,958 university freshmen. Because the scales have many attributes and the database is large, the traditional Apriori association rule algorithm is difficult to apply. A multidimensional association rule mining algorithm was therefore designed and implemented on the basis of the Apriori algorithm and applied to studying the relationships between scale attributes in psychological measurement. Experimental results show that the multidimensional association rule method can mine multidimensional association rules between attributes relatively quickly and more accurately, and that these rules can play a guiding role in psychological measurement work, indicating that the method is very effective.

  10. Service Composition Design Pattern for Autonomic Computing Systems Using Association Rule Based Learning and Service-Oriented Architecture

    Directory of Open Access Journals (Sweden)

    Vishnuvardhan Mannava

    2012-10-01

    Full Text Available In this paper we will compose the design patterns which will satisfy the properties of an autonomic computing system: for the Decision-Making phase we will introduce the Case-Based Reasoning design pattern, and for the Reconfiguration phase we will introduce the Reactor design pattern. The most important proposal in our composite design pattern is that we will use the Association Rule Learning method of Data Mining to learn about new services that can be added along with the requested service to make the service a dynamic composition of two or more services. Then we will include the new service as an aspectual feature module code without interrupting the user. As far as we know, there are no studies on the composition of design patterns and pattern languages for the autonomic computing domain. We will validate our work with a simple case study. Simple Class and Sequence diagrams are depicted.

  11. MOCANAR: A MULTI-OBJECTIVE CUCKOO SEARCH ALGORITHM FOR NUMERIC ASSOCIATION RULE DISCOVERY

    OpenAIRE

    Irene Kahvazadeh; Mohammad Saniee Abadeh

    2015-01-01

    Extracting association rules from numeric features involves searching a very large search space. To deal with this problem, in this paper a meta-heuristic algorithm is used that we have called MOCANAR. The MOCANAR is a Pareto based multi-objective cuckoo search algorithm which extracts high quality association rules from numeric datasets. The support, confidence, interestingness and comprehensibility are the objectives that have been considered in the MOCANAR. The MOCANAR extra...
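
    The "Pareto based multi-objective" selection named in this abstract can be illustrated with a small dominance filter. This is a sketch of Pareto dominance only, assuming that all four objectives are maximized; the cuckoo search itself is not shown, the abstract does not describe how MOCANAR applies dominance, and the rule names and scores below are invented for the example.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored_rules):
    """Keep the rules whose score tuples are not dominated by any other rule."""
    return [
        (name, scores)
        for name, scores in scored_rules
        if not any(dominates(other, scores) for _, other in scored_rules)
    ]


# Hypothetical (support, confidence, interestingness, comprehensibility) scores.
candidates = [
    ("r1", (0.40, 0.90, 0.30, 0.80)),
    ("r2", (0.35, 0.85, 0.25, 0.70)),  # worse than r1 on every objective
    ("r3", (0.10, 0.95, 0.60, 0.50)),  # trades support for confidence
]
front = pareto_front(candidates)
```

    Here `r2` is dropped because `r1` beats it everywhere, while `r1` and `r3` both survive because neither is better on all four objectives.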

  12. [Analysis on medication rules of state medical master yan zhenghua's prescriptions that including Polygoni Multiflori Caulis based on data mining].

    Science.gov (United States)

    Wu, Jia-rui; Guo, Wei-xian; Zhang, Xiao-meng; Yang, Bing; Zhang, Bing; Zhao, Meng-di; Sheng, Xiao-guang

    2014-11-01

    The prescriptions including Polygoni Multiflori Caulis that were written by Prof. Yan were collected to build a database based on the traditional Chinese medicine (TCM) inheritance assist system. Association rule mining with the Apriori algorithm was used to obtain the frequency of single medicines, the frequency of drug combinations, association rules between drugs and core drug combinations. The data mining results indicated that in the prescriptions that included Polygoni Multiflori Caulis, the most frequently used drugs were parched Ziziphi Spinosae Semen, Ostreae Concha, Ossis Mastodi Fossilia, Salviae Miltiorrhizae Radix Et Rhizoma, Paeoniae Rubra Radix, and so on. The most frequent drug combinations were "Polygoni Multiflori Caulis-parched Ziziphi Spinosae Semen", "Ostreae Concha-Polygoni Multiflori Caulis", and "Polygoni Multiflori Caulis-Ossis Mastodi Fossilia". The drug association rules with a confidence coefficient of 1 were "Ostreae Concha-->Polygoni Multiflori Caulis", "Poria-->Polygoni Multiflori Caulis", "parched Ziziphi Spinosae Semen-->Polygoni Multiflori Caulis", and "Paeoniae Alba Radix-->Polygoni Multiflori Caulis". The core drug combinations in the treatment of insomnia were Ossis Mastodi Fossilia, Polygoni Multiflori Caulis, Salviae Miltiorrhizae Radix et Rhizoma, Ostreae Concha, Polygalae Radix, Margaritifera Concha, Poria, and parched Ziziphi Spinosae Semen. The core drug combinations in the treatment of obstruction of Qi in chest were Salviae Miltiorrhizae Radix Et Rhizoma, Polygoni Multiflori Caulis, parched Ziziphi Spinosae Semen, Trichosanthis Fructus, Allii Macrostemonis Bulbus, and Paeoniae Rubra Radix. PMID:25850286
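
    The kind of analysis described in this abstract (item frequencies, frequent combinations, and rules with a confidence of 1) can be sketched with a two-level Apriori-style pass. The prescriptions and item names below are placeholders, not data from the study, and the thresholds and function names are assumptions; the TCM inheritance assist system itself is not modelled.

```python
from collections import Counter
from itertools import combinations

# Hypothetical prescriptions; item names are placeholders, not data from the study.
prescriptions = [
    {"herb_A", "herb_B", "herb_C"},
    {"herb_A", "herb_B"},
    {"herb_A", "herb_C"},
    {"herb_A", "herb_B", "herb_D"},
]


def frequent_pairs(baskets, min_count):
    """Apriori-style search: find frequent single items first, then count
    only the pairs built from those items (infrequent items are pruned)."""
    item_counts = Counter(item for basket in baskets for item in basket)
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket & frequent_items), 2):
            pair_counts[pair] += 1
    pairs = {p: c for p, c in pair_counts.items() if c >= min_count}
    return item_counts, pairs


def rules_with_confidence(item_counts, pairs, min_conf):
    """Return sorted (antecedent, consequent, confidence) rules from frequent pairs."""
    rules = []
    for (x, y), count in pairs.items():
        for antecedent, consequent in ((x, y), (y, x)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_conf:
                rules.append((antecedent, consequent, confidence))
    return sorted(rules)


item_counts, pairs = frequent_pairs(prescriptions, min_count=2)
rules = rules_with_confidence(item_counts, pairs, min_conf=1.0)
```

    A rule with confidence 1 simply means that every prescription containing the antecedent also contains the consequent, which is expected when the consequent appears in every prescription in the database.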

  13. The Research for Multidimensional Association Rules Algorithm Based on MDPI

    Institute of Scientific and Technical Information of China (English)

    彭硕; 吴昊

    2011-01-01

    Mining multidimensional association rules is an important research direction in data mining. In this paper, we propose an efficient algorithm for mining multidimensional association rules, which combines the data cube technique with the frequent itemset mining algorithm FP-Growth by constructing an MDPI-tree (multidimensional predicate index tree). The algorithm can mine both inter-dimension and hybrid-dimension association rules. Lastly, the algorithm is applied to a cross-selling model of mobile communication, and its practicality and effectiveness are verified by experiment.

  14. Wetland and waterbody restoration and creation associated with mining

    International Nuclear Information System (INIS)

    Published and unpublished reports are reviewed and the strategies and techniques used to facilitate the establishment of wetlands and waterbodies during mine reclamation are summarized. Although the emphasis is on coal, phosphate, and sand and gravel operations, the methods are relevant to other types of mining and mitigation activities. The following key points should receive attention during planning and mitigation processes: (1) development of site-specific objectives that are related to regional wetland trends; (2) integration of wetland mitigation plans with mining operations and reclamation at the beginning of any project; (3) wetland designs that mimic natural systems and provide flexibility for unforeseen events; (4) assurance that basin morphometry and control of the hydrologic regime are properly addressed before considering other aspects of a project; and (5) identification of mandatory monitoring as a known cost. Well-designed studies that use comparative approaches are needed to increase the database on wetland restoration technology. Meanwhile, regional success criteria for different classes of wetlands need to be developed by consensus agreement among professionals. The rationale for a particular mitigation strategy must have a sound, scientific basis if the needs of mining industries are to be balanced against the necessity of wetland operation. 93 refs., 3 figs

  15. The methodology and problems associated with corrosion testing in South African mines

    Energy Technology Data Exchange (ETDEWEB)

    McEwan, J.J.; Enright, D.P. [Mintek, Randburg (South Africa); Leitch, J.E. [Hulett Aluminium (Pty) Ltd., Pietermaritzburg (South Africa)

    1995-10-01

    The mining industry is of fundamental importance to the South African economy, contributing over 10 per cent of the gross domestic product (GDP). It is one of the largest and oldest industrial sectors in the country, with gold, coal and diamonds as the most valuable exports. The corrosion costs to the South African mining industry are in excess of R1 billion (US$ 300 million) per year. An area of particular concern is in the mine shafts where not only can the conditions be very corrosive, but shaft utilization entails that there is very little time available for maintenance, repair etc. The problems are compounded in the newer ultra-deep gold mines (up to 4 km or 2½ miles deep), where large volumes of sometimes untreated water are required for both the mining operation and cooling. This paper details the corrosion problems encountered in shafts, and also the problems associated with performing in situ evaluations.

  16. NON-REDUNDANT SEQUENCE RULES MINING BASED ON INCLUSION AND DEDUCTION ANALYSIS

    Institute of Scientific and Technical Information of China (English)

    周新; 王乙民; 刘婧; 尤涛

    2016-01-01

    Sequence rule mining aims at finding the causal associations between frequent sequences. The current best approach to generating sequence rules considers only the inclusion relationship between two rules and not the deduction relationship among multiple rules, and therefore produces many redundant rules. We introduce the concept of deductive non-redundant rules, analyse the reasons for deductive redundancy, and redefine the concept of non-redundant rules. On the basis of frequent closed sequences and their generators, we also present a non-redundant sequence rule extraction algorithm based on maximum overlap term redundancy checking. Theoretical analysis and experimental assessment demonstrate that this algorithm improves the quality of the generated sequence rules with almost the same efficiency.

  17. Applications of multi-season hyperspectral remote sensing for acid mine water characterization and mapping of secondary iron minerals associated with acid mine drainage

    Science.gov (United States)

    Davies, Gwendolyn E.

    Acid mine drainage (AMD) resulting from the oxidation of sulfides in mine waste is a major environmental issue facing the mining industry today. Open pit mines, tailings ponds, ore stockpiles, and waste rock dumps can all be significant sources of pollution, primarily heavy metals. These large mining-induced footprints are often located across vast geographic expanses and are difficult to access. With the continuing advancement of imaging satellites, remote sensing may provide a useful monitoring tool for pit lake water quality and the rapid assessment of abandoned mine sites. This study explored the applications of laboratory spectroscopy and multi-season hyperspectral remote sensing for environmental monitoring of mine waste environments. Laboratory spectral experiments were first performed on acid mine waters and synthetic ferric iron solutions to identify and isolate the unique spectral properties of mine waters. These spectral characterizations were then applied to airborne hyperspectral imagery for identification of poor water quality in AMD ponds at the Leviathan Mine Superfund site, CA. Finally, imagery varying in temporal and spatial resolutions were used to identify changes in mineralogy over weathering overburden piles and on dry AMD pond liner surfaces at the Leviathan Mine. Results show the utility of hyperspectral remote sensing for monitoring a diverse range of surfaces associated with AMD.

  18. Quantitative Neuropathology Associated with Chronic Manganese Exposure in South African Mine Workers

    Science.gov (United States)

    Gonzalez-Cuyar, Luis F.; Nelson, Gill; Criswell, Susan R.; Ho, Pokuan; Lonzanida, Jaymes A.; Checkoway, Harvey; Seixas, Noah; Gelman, Benjamin B.; Evanoff, Bradley A.; Murray, Jill; Zhang, Jing; Racette, Brad A.

    2014-01-01

    Manganese (Mn) is a common neurotoxicant associated with a clinical syndrome that includes signs and symptoms referable to the basal ganglia. Despite many advances in understanding the pathophysiology of Mn neurotoxicity in humans, with molecular and structural imaging techniques, only a few case reports describe the associated pathological findings, and all are in symptomatic subjects exposed to relatively high-level Mn. We performed an exploratory, neurohistopathological study to investigate the changes in the corpus striatum (caudate nucleus, putamen, and globus pallidus) associated with chronic low-level Mn exposure in South African Mn mine workers. Immunohistochemical techniques were used to quantify cell density of neuronal and glial components of the corpus striatum in eight South African Mn mine workers without clinical evidence of a movement disorder and eight age-race-gender matched, non-Mn mine workers. There was higher mean microglia density in Mn mine workers than non-Mn mine workers in the globus pallidus external and internal segments [GPe: 1.33 and 0.87 cells per HPF, respectively (p=0.064); GPi: 1.37 and 0.99 cells per HPF, respectively (p=0.250)]. The number of years worked in the Mn mines was significantly correlated with microglial density in the GPi (Spearman's rho 0.886; p=0.019). The ratio of astrocytes to microglia in each brain region was lower in the Mn mine workers than the non-Mn mine workers in the caudate (7.80 and 14.68; p=0.025), putamen (7.35 and 11.11; p=0.117), GPe (10.60 and 16.10; p=0.091) and GPi (9.56 and 12.42; p=0.376). Future studies incorporating more detailed occupational exposures in a larger sample of Mn mine workers will be needed to demonstrate an etiologic relationship between Mn exposure and these pathological findings. PMID:24374477

  19. Mining Linguistic Associations for Emergent Flood Prediction Adjustment

    OpenAIRE

    Michal Burda; Pavel Rusnok; Martin Štěpnička

    2013-01-01

    Floods belong to the most hazardous natural disasters and their disaster management heavily relies on precise forecasts. These forecasts are provided by physical models based on differential equations. However, these models do depend on unreliable inputs such as measurements or parameter estimations which causes undesirable inaccuracies. Thus, an appropriate data-mining analysis of the physical model and its precision based on features that determine distinct situations seems to be helpful in...

  20. Groundwater recovery problems associated with opencast mine backfills

    OpenAIRE

    Reed, S M

    1986-01-01

    The research outlined in this thesis is concerned with the environmental aspects of groundwater re-establishment as a consequence of surface mining. Two principal effects which have been identified as being detrimental to the restored land area are as follows: i) the vertical and horizontal displacements of backfill materials following restoration, and ii) the pollution of groundwater from contact with weathered rockfill materials. The research into settlement has attempted to cl...

  1. Research of the Occupational Psychological Impact Factors Based on the Frequent Item Mining of the Transactional Database

    OpenAIRE

    Cheng Dongmei; Zuo Xuejun; Liu Zhaohua

    2015-01-01

    Based on the massive reading of data mining and association rules mining documents, this paper will start from compressing transactional database and propose the frequent complementary item storage structure of the transactional database. According to the previous analysis, this paper will also study the association rules mining algorithm based on the frequent complementary item storage structure of the transactional database. At last, this paper will apply this mining algorithm in the test r...

  2. Air Pollution Monitoring & Tracking System Using Mobile Sensors and Analysis of Data Using Data Mining

    OpenAIRE

    Umesh M. Lanjewar, J. J. Shah

    2012-01-01

    This study proposes an air pollution monitoring system and the analysis of pollution data using the association rule data mining technique. The association rule data mining technique aims at finding association patterns among various parameters. In this paper, association rule mining is presented for finding association patterns among various air pollutants. For this, the Apriori algorithm of association rule data mining is used. Apriori is characterized as a level-by-level complete search algorithm. This algorithm is ...
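    The level-by-level search mentioned in this abstract can be sketched as follows. This is a minimal illustration of the generic Apriori idea, not the authors' implementation; the function name `apriori`, the pollutant labels, and the readings are made up for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset seen in at least
    min_support transactions, searching one itemset size (level) at a time."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical discretized pollutant readings, one set per measurement.
readings = [
    {"SO2_high", "NO2_high", "PM10_high"},
    {"SO2_high", "NO2_high"},
    {"NO2_high", "PM10_high"},
    {"SO2_high", "NO2_high", "PM10_high"},
]
frequent_itemsets = apriori(readings, min_support=3)
```

    With these readings the pair {SO2_high, PM10_high} occurs only twice, so it is dropped at level 2 and the three-item candidate is pruned without being counted.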

  3. Prevalence and factors associated with obesity amongst employees of open-cast diamond mine in Namibia

    OpenAIRE

    Desderius Haufiku; Hans Justus Amukugo

    2015-01-01

    The study investigated the prevalence and factors associated with obesity amongst employees of Pocket Beaches mine. Obesity rates are increasing at an alarming rate worldwide; 1.2 billion people worldwide are overweight, of whom 300 million are clinically obese. Of concern is that obesity is a risk factor for many diseases, including hypertension, diabetes and other forms of cancers. Although there are several mine workers who on reporting to occupational health services for minor ailment ar...

  4. Prospectors and Developers Association of Canada Mining Matters: A Model of Effective Outreach

    Science.gov (United States)

    Hymers, L.; Heenan, S.

    2009-05-01

    Prospectors and Developers Association of Canada Mining Matters is a charitable organization whose mandate is to bring the wonders of Canada's geology and mineral resources to students, educators and industry. The organization provides current information about rocks, minerals, metals, and mining and offers exceptional educational resources, developed by teachers and for teachers that meet Junior, Intermediate and Senior Provincial Earth Science and Geography curriculum expectations. Since 1994, Mining Matters has reached more than 400,000 educators, students, industry representatives, and Aboriginal Youth through Earth Science resources. At the time of the program's inception, members of the Prospectors and Developers Association of Canada (PDAC) realized that their mining and mineral industry expertise could be of help to teachers and students. Consulting experts in education, government, and business, and the PDAC worked together to develop the first Mining Matters Earth Science curriculum kit for Grades 6 and 7 teachers in Ontario. PDAC Mining Matters became the official educational arm of the Association and a charitable organization in 1997. Since then, the organization has partnered with government, industry, and educators to develop bilingual Earth science teaching units for Grades 4 and 7, and senior High School. The teaching units consist of kits that contain curriculum correlated lesson plans, inform bulletins, genuine data sets, rock and mineral samples, equipment and additional instructional resources. Mining Matters offers instructional development workshops for the purposes of training pre-service and in- service educators to use our teaching units in the classroom. The workshops are meant to provide teachers with the knowledge and confidence they need to successfully employ the units in the classroom. Formal mechanisms for resource and workshop evaluations are in place. Overwhelmingly teacher feedback is positive, describing the excellence

  5. Data Mining Association in the Data Analysis

    Institute of Scientific and Technical Information of China (English)

    金宗泽; 冯亚丽; 纪博; 张希; 高快

    2014-01-01

    In this era of information explosion, big data is increasingly close to our lives. First, where big data comes from and how to study it are introduced. Then the framework of the big data analysis process is described, and the importance of association mining in big data analysis is elaborated. A corresponding research approach to big data association mining is given: the system analyzes the association patterns, while users can also select parameters manually to analyze, filter, and retain the parameters under study. In the course of big data analysis, making good use of association rule mining will bring greater practical value.

  6. Data Mining Techniques: A Source for Consumer Behavior Analysis

    CERN Document Server

    Raorane, Abhijit

    2011-01-01

    Various studies on consumer purchasing behaviors have been presented and used in real problems. Data mining techniques are expected to be a more effective tool for analyzing consumer behaviors. However, the data mining method has disadvantages as well as advantages. Therefore, it is important to select appropriate techniques to mine databases. The objective of this paper is to understand consumer behavior, the consumer's psychological condition at the time of purchase, and how a suitable data mining method can be applied to improve on conventional methods. Moreover, in an experiment, association rules are employed to mine rules for trusted customers using sales data from the supermarket industry.

  7. Study on structuring the supervision system of coal mine associated with radionuclides in Xinjiang

    International Nuclear Information System (INIS)

    Xinjiang is one of China's coal-rich provinces (areas) and accounts for about 40% of national coal reserves. In long-term radioactive scientific research, monitoring and environmental impact assessment work, we found that part of the coal in Yili and Hetian was associated with elevated radionuclide levels, and that some coal seams even reached nuclear mining level. However, the laws and regulations on the supervision of coal mines associated with radioactivity are incomplete, and the supervision system is still being explored. Starting from coal mine enterprises' geological prospecting reports, radiation environmental impact assessments, the preparation of monitoring reports for environmental acceptance checks, and supervisory monitoring, this article considers how to control radioactive pollution from coal at the source and studies the construction of a supervision system for coal mines associated with radioactivity in Xinjiang. The establishment of the supervision system will provide technical guidance for enterprises' coal exploitation and cinder use on the one hand, and on the other hand will give Xinjiang environmental regulators a decision-making basis for strengthening the supervision of such coal mines. (authors)

  8. Atomic data mining numerical methods, source code SQlite with Python

    OpenAIRE

    Khwaldeh, Ali; Tahat, Amani; Martí Rabassa, Jordi; Tahat, Mofleh

    2013-01-01

    This paper introduces a recently published Python data mining book (chapters, topics, samples of Python source code written by its authors) to be used in data mining via the world wide web and any specific database in several disciplines (economics, physics, education, marketing, etc.). The book starts with an introduction to data mining by explaining some of the data mining tasks involved: classification, dependence modelling, clustering and discovery of association rules. The book addressed that ...

  9. Fast Vertical Mining Using Boolean Algebra

    OpenAIRE

    Hosny M. Ibrahim; Marghny, M. H.; Noha M. A. Abdelaziz

    2015-01-01

    The vertical association rule mining algorithm is an efficient mining method, which makes use of the support sets of frequent itemsets to calculate the support of candidate itemsets. It overcomes the disadvantage of scanning the database many times, as the Apriori algorithm does. In vertical mining, frequent itemsets can be represented as a set of bit vectors in memory, which enables fast computation. The sizes of bit vectors for itemsets are the main space expense of the algorithm that restricts its ex...
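    The bit-vector representation described in this abstract can be illustrated with a short sketch. It is not the paper's algorithm: it only shows how the support of an itemset can be obtained by AND-ing per-item bit vectors instead of rescanning the database. The helper names `build_bitvectors` and `support` and the sample transactions are hypothetical.

```python
def build_bitvectors(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors

def support(itemset, vectors, n_transactions):
    """Support count = number of set bits in the AND of the items' bit vectors."""
    acc = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        acc &= vectors.get(item, 0)  # an unseen item clears all bits
    return bin(acc).count("1")

# Made-up transaction database, scanned once to build the vertical layout.
transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "c"]]
vectors = build_bitvectors(transactions)
support_ab = support(["a", "b"], vectors, len(transactions))
support_abc = support(["a", "b", "c"], vectors, len(transactions))
```

    Python integers serve as arbitrary-length bit vectors here; a production implementation would more likely use packed arrays, and the memory cost of those vectors is the space limitation the abstract refers to.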

  10. Mining of Datasets with an Enhanced Apriori Algorithm

    Directory of Open Access Journals (Sweden)

    V. P. Arunachalam

    2012-01-01

    Problem statement: Classical association rules mostly mine intra-transaction associations, i.e., associations among items within the same transaction, where the idea behind the transaction could be the items bought by the same customer on the same day. The goal of inter-transaction association rules is to represent the associations between various events found in different transactions. Approach: In this study, we break the barrier of transactions and extend the scope of mining association rules from traditional single-dimensional, intra-transaction associations to N-dimensional, inter-transaction associations. With the introduction of dimensional attributes, we lose the luxury of the simple representational form of the classical association rules. Mining inter-transaction associations poses more challenges for efficient processing than mining intra-transaction associations because the number of potential association rules becomes extremely large after the boundary of transactions is broken. Results: Various tests were also conducted using data sets collected from different Stock Exchanges (SE). Various experimental results are reported by comparing real-life and synthetic datasets, and we show the effectiveness of our work in generating rules and in finding an acceptable set of rules under varying conditions. Conclusion/Recommendations: This study introduces the notion of the N-dimensional inter-transaction association rule, defines its measurements, support and confidence, and develops an efficient algorithm called Modified Apriori.

  11. Multiple simulation experimental studies of gas emission, distribution and migration rules in mine ventilation system and goaf area

    OpenAIRE

    Zhang, Haoran

    2015-01-01

    Gas problems have created severe difficulties for the mining industry around the world, leading to high expenditures, intensive research efforts, and determined attempts to enhance the various ventilation optimization and gas drainage techniques. Meanwhile, gas research has been thriving in recent years, and gas drainage technology will continue to be a growing industry over the coming decades in many mining countries. Safety mining technologies including field investigation, numerical simula...

  12. The Contribution of „Ruda 12 Apostoli” Mining Association in Brad to the Development of Transylvanian Gold Mining Between 1884 – 1921

    OpenAIRE

    MIRCEA BARON

    2012-01-01

    One of the major gold mining regions in Romania is part of the gold rectangle in the Apuseni Mountains and lies around the town of Brad. It is here that the ”Ruda 12 Apostoli” Mining Association of cuxas was established at the end of the XVIIIth century. This association was to become the most important unit for the mining of precious metals in the entire Austrian – Hungarian Empire after 1884, when it was taken over by the German company ”Harkortschen Bergwerke und Chemische Fabriken zu Schw...

  13. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    Science.gov (United States)

    Huang, Zhong

    2014-01-01

    Disease associated gene discovery is a critical step to realize the future of personalized medicine. However empirical and clinical validation of disease associated genes are time consuming and expensive. In silico discovery of disease associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  14. Spatial Data Mining using Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ch.N.Santhosh Kumar

    2012-09-01

    Data mining, which is also referred to as Knowledge Discovery in Databases (KDD), means a process of nontrivial extraction of implicit, previously unknown and useful information such as knowledge rules, descriptions, regularities, and major trends from large databases. Data mining has evolved into a multidisciplinary field, including database technology, machine learning, artificial intelligence, neural networks, information retrieval, and so on. In principle data mining should be applicable to the different kinds of data and databases used in many different applications, including relational databases, transactional databases, data warehouses, object-oriented databases, and special application-oriented databases such as spatial databases, temporal databases, multimedia databases, and time-series databases. Spatial data mining, also called spatial mining, is data mining as applied to spatial data or spatial databases. Spatial data are data that have a spatial or location component, and they show information that is more complex than classical data. A spatial database stores spatial data represented by spatial data types and the spatial relationships among the data. Spatial data mining encompasses various tasks. These include spatial classification, spatial association rule mining, spatial clustering, characteristic rules, discriminant rules, and trend detection. This paper presents how spatial data mining is achieved using clustering.

  15. A procedure for NEPA assessment of selenium hazards associated with mining.

    Science.gov (United States)

    Lemly, A Dennis

    2007-02-01

    This paper gives step-by-step instructions for assessing aquatic selenium hazards associated with mining. The procedure was developed to provide the U.S. Forest Service with a proactive capability for determining the risk of selenium pollution when it reviews mine permit applications in accordance with the National Environmental Policy Act (NEPA). The procedural framework is constructed in a decision-tree format in order to guide users through the various steps, provide a logical sequence for completing individual tasks, and identify key decision points. There are five major components designed to gather information on operational parameters of the proposed mine as well as key aspects of the physical, chemical, and biological environment surrounding it--geological assessment, mine operation assessment, hydrological assessment, biological assessment, and hazard assessment. Validation tests conducted at three mines where selenium pollution has occurred confirmed that the procedure will accurately predict ecological risks. In each case, it correctly identified and quantified selenium hazard, and indicated the steps needed to reduce this hazard to an acceptable level. By utilizing the procedure, NEPA workers can be confident in their ability to understand the risk of aquatic selenium pollution and take appropriate action. Although the procedure was developed for the Forest Service it should also be useful to other federal land management agencies that conduct NEPA assessments, as well as regulatory agencies responsible for issuing coal mining permits under the authority of the Surface Mining Control and Reclamation Act (SMCRA) and associated Section 401 water quality certification under the Clean Water Act. Mining companies will also benefit from the application of this procedure because priority selenium sources can be identified in relation to specific mine operating parameters. The procedure will reveal the point(s) at which there is a need to modify operating

  16. Fuzzy association rules for biological data analysis: a case study on yeast

    OpenAIRE

    Cano Carlos; Garcia Fernando; Blanco Armando; Lopez Francisco J; Marin Antonio

    2008-01-01

    Abstract. Background: Recent years' mapping of diverse genomes has generated huge amounts of biological data which are currently dispersed through many databases. Integration of the information available in the various databases is required to unveil possible associations relating already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modeling imprecise data, while association rules are very appropriate for integrating heterogeneous data. Results: In ...

  17. DataAprori algorithm : Implementation of scalable Data Mining by using Aprori algorithm

    OpenAIRE

    M Afshar Alam , Sapna Jain,Ranjit Biswas

    2011-01-01

    Data Mining is concerned with the development and applications of algorithms for the discovery of a priori unknown relationships (associations, groupings, classifiers) from data. Association rule mining (ARM) is a knowledge discovery technique used in various data mining applications. The task of discovering scalable rules from a multidimensional database with reduced support is an area for exploration in research. Pruning is a technique for simplifying and hence generalising a decision tree. E...

  18. Semi-Trusted Mixer Based Privacy Preserving Distributed Data Mining for Resource Constrained Devices

    OpenAIRE

    Xun Yi; Md. Golam Kaosar

    2010-01-01

    In this paper a homomorphic privacy preserving association rule mining algorithm is proposed which can be deployed in resource constrained devices (RCD). Privacy preserving exchange of the counts of itemsets among distributed mining sites is a vital part of the association rule mining process. Existing cryptography-based privacy preserving solutions consume a lot of computation due to the complex mathematical equations involved. Therefore privacy solutions that involve less computation are extremely necessary t...

  19. ON MINING ENTREPRENEURSHIP IN BANOVINA REGION (CROATIA)

    Directory of Open Access Journals (Sweden)

    Berislav Šebečić

    2000-12-01

    Mining activities in the exploitation of iron, copper, and lead(-silver) ores in Trgovska gora Mountain were developed back in Illyrian and Roman times as well as in the Middle Ages and recent times, whereas in Petrova gora Mountain the exploitation of iron ores and coal developed as late as the 19th and 20th centuries. In the Middle Ages and more recent times, Croatian nobility (the counts of Zrinski and Keglević) and later also foreign nobility or foreign and domestic mining associations were given mining concessions. The mining enterprise in the Banovina Region passed to different owners and managers from the mid-19th century to the mid-20th century. During the Austro-Hungarian rule the main mining concession was owned by »Gewerkschaft der Eisenbergwerke und Huttenwerke Petrova gora zu Topusko« or its shorter version »Petrova gora Gewerkschaft«. The major mining entrepreneurs on the Trgovska gora Mountain at Bešlinac were Desire Gilain, Joseph Steinauer and Alois Frohm. After World War I and the confiscation of the properties of foreign mining associations and entrepreneurs, the Petrova gora Association of Mines and Foundry at Topusko, the Slavenska Bank Zagreb (until 1923), and the Iron Mine and Foundry Inc. at Topusko were constituted and went bankrupt rather quickly. After the bankruptcy of the National Industrial Enterprise Zagreb (1929), the Mining Association and Iron Foundry was founded at Bešlinac (1934). Also operating in the region of Banovina were the Kupa-Glina Mining Association (active also during the Austro-Hungarian rule), the Mineral Mining Association from Topusko, and the Iron Mine and Foundry Topusko-Vojnić Headquarters. The property of all the mentioned associations and entrepreneurs was confiscated by the Federal People's Republic of Yugoslavia in 1946.

  20. E-commerce Website Recommender System Based on Dissimilarity and Association Rule

    Directory of Open Access Journals (Sweden)

    MingWang Zhang

    2013-07-01

    Based on an analysis of current e-commerce recommendation algorithms, this paper puts forward a recommendation algorithm that combines dissimilarity clustering with association rules. The algorithm clusters the shopping data of website users by dissimilarity and then applies an association rule algorithm to the clustering results to produce recommendations. Experiments show that, compared with a traditional clustering-association algorithm, the proposed algorithm needs fewer iterations and improves operational efficiency. The method is validated on recommendations for actual user purchases, which provides evidence of the algorithm's effectiveness in recommendation.

  1. Integrated Text Mining and Chemoinformatics Analysis Associates Diet to Health Benefit at Molecular Level

    DEFF Research Database (Denmark)

    Jensen, Kasper; Panagiotou, Gianni; Kouskoumvekaki, Irene

    2014-01-01

    , lipids and nutrients. In this work, we applied text mining and Naïve Bayes classification to assemble the knowledge space of food-phytochemical and food-disease associations, where we distinguish between disease prevention/amelioration and disease progression. We subsequently searched for frequently...

  2. Metagenome-wide association studies: fine-mining the microbiome.

    Science.gov (United States)

    Wang, Jun; Jia, Huijue

    2016-08-01

    Metagenome-wide association studies (MWAS) have enabled the high-resolution investigation of associations between the human microbiome and several complex diseases, including type 2 diabetes, obesity, liver cirrhosis, colorectal cancer and rheumatoid arthritis. The associations that can be identified by MWAS are not limited to the identification of taxa that are more or less abundant, as is the case with taxonomic approaches, but additionally include the identification of microbial functions that are enriched or depleted. In this Review, we summarize recent findings from MWAS and discuss how these findings might inform the prevention, diagnosis and treatment of human disease in the future. Furthermore, we highlight the need to better characterize the biology of many of the bacteria that are found in the human microbiota as an essential step in understanding how bacterial strains that have been identified by MWAS are associated with disease. PMID:27396567

  3. Data Mining E-protokol - Applying data mining techniques on student absence

    OpenAIRE

    Shrestha, Amardip; Bro Lilleås, Lauge; Hansen, Asbjørn

    2014-01-01

    The scope of this project is to explore the possibilities of applying data mining techniques for discovering new knowledge about student absenteeism in primary school. The research consists of analyzing a large dataset collected through the digital protocol system E-protokol. The data mining techniques used for the analysis involve clustering, classification and association rule mining, which are applied using the machine learning toolset WEKA. The findings include a number of suggestions ...

  4. A study of trends in occupational risks associated with coal mining

    Energy Technology Data Exchange (ETDEWEB)

    Amoundru, C.

    1980-10-01

    The occupational risks associated with underground coal mining can be categorized as either industrial accidents or occupational diseases. Since 1957, the number of fatal accidents per million tons of coal produced has dropped by a factor of four. The number of industrial accidents in general decreased by 30% during 1967-75. The main occupational diseases affecting miners are arthrosis, deafness, and pneumoconiosis. To make an objective comparison with the health hazards from other sources of energy, the probable risks facing workers in a modern mine should be compared with those currently confronting workers in other industries.

  5. Short-term optimal operation of Three-gorge and Gezhouba cascade hydropower stations in non-flood season with operation rules from data mining

    International Nuclear Information System (INIS)

    Highlights: ► Short-term optimal operation of Three-gorge and Gezhouba hydropower stations was studied. ► Key state variable and exact constraints were proposed to improve numerical model. ► Operation rules proposed were applied in population initiation step for faster optimization. ► Culture algorithm with difference evolution was selected as optimization method. ► Model and method proposed were verified by case study with feasible operation solutions. - Abstract: Information hidden in the characteristics and relationship data of a cascade hydropower stations can be extracted by data-mining approaches to be operation rules and optimization support information. In this paper, with Three-gorge and Gezhouba cascade hydropower stations as an example, two operation rules are proposed due to different operation efficiency of water turbines and tight water volume and hydraulic relationship between two hydropower stations. The rules are applied to improve optimization model with more exact decision and state variables and constraints. They are also used in the population initiation step to develop better individuals with culture algorithm with differential evolution as an optimization method. In the case study, total feasible population and the best solution based on an initial population with an operation rule can be obtained with a shorter computation time than that of a pure random initiated population. Amount of electricity generation in a dispatch period with an operation rule also increases with an average increase rate of 0.025%. For a fixed water discharge process of Three-gorge hydropower station, there is a better rule to decide an operation plan of Gezhouba hydropower station in which total hydraulic head for electricity generation is optimized and distributed with inner-plant economic operation considered.

  6. Data Mining as Support to Knowledge Management in Marketing

    OpenAIRE

    Zekić-Sušac Marijana; Has Adela

    2015-01-01

    Background: Previous research has shown success of data mining methods in marketing. However, their integration in a knowledge management system is still not investigated enough. Objectives: The purpose of this paper is to suggest an integration of two data mining techniques: neural networks and association rules in marketing modeling that could serve as an input to knowledge management and produce better marketing decisions. Methods/Approach: Association rules and artificial neural networks ...

  7. Environmental impacts associated with an abandoned mine in the Witbank Coalfield, South Africa

    International Nuclear Information System (INIS)

    Mining at Middelburg Colliery in the Witbank Coalfield commenced at the turn of the last century. Initially, there was little environmental degradation associated with mining activities; however, in the late 1930s, a pillar-robbing programme commenced. This has had a marked effect on the environment. Some of the most notable primary effects include subsidence, the appearance of tension cracks at the surface and crownhole development. Secondary effects include spontaneous combustion of the coal worked, as air has been provided with ready access to the mine, accelerated subsidence due to the strength of many pillars being reduced by burning, and a marked deterioration of groundwater quality in the area due to the seepage of acid mine drainage from the mine. Spoil heaps also form blemishes on the landscape. These contain significant amounts of coal and have undergone spontaneous combustion. The deterioration in the quality of water has led to the decimation of vegetation in some areas and the eradication of aquatic flora and fauna in a nearby stream

  8. Integrated assessment of the impacts associated with uranium mining and milling

    International Nuclear Information System (INIS)

    The occupational health and safety impacts are assessed for domestic underground mining, open pit mining, and milling. Public health impacts are calculated for a population of 53,000 located within 88 km (55 miles) of a typical southwestern uranium mill. The collective annual dose would be 6.5 man-lung rem/year, 89% of which is from ²²²Rn emitted from mill tailings. The dose to the United States population is estimated to be 6 × 10⁴ man-lung rem from combined mining and milling operations. This may be compared with 5.7 × 10⁵ man-lung rem from domestic use of natural gas and 4.4 × 10⁷ man-lung rem from building interiors. Unavoidable adverse environmental impacts appear to be severe in a 250 ha area surrounding a mill site but negligible in the entire potentially impacted area (500,000 ha). The contemporary uranium resource and supply industry and its institutional settings are described in relation to the socio-economic impacts likely to emerge from high levels of uranium mining and milling. Radon and radon daughter monitoring techniques associated with uranium mining and milling are discussed.

  9. Integrated assessment of the impacts associated with uranium mining and milling

    Energy Technology Data Exchange (ETDEWEB)

    Parzyck, D.C.; Baes, C.F. III; Berry, L.G.

    1979-07-01

    The occupational health and safety impacts are assessed for domestic underground mining, open pit mining, and milling. Public health impacts are calculated for a population of 53,000 located within 88 km (55 miles) of a typical southwestern uranium mill. The collective annual dose would be 6.5 man-lung rem/year, 89% of which is from 222Rn emitted from mill tailings. The dose to the United States population is estimated to be 6 x 10^4 man-lung rem from combined mining and milling operations. This may be compared with 5.7 x 10^5 man-lung rem from domestic use of natural gas and 4.4 x 10^7 man-lung rem from building interiors. Unavoidable adverse environmental impacts appear to be severe in a 250 ha area surrounding a mill site but negligible in the entire potentially impacted area (500,000 ha). The contemporary uranium resource and supply industry and its institutional settings are described in relation to the socio-economic impacts likely to emerge from high levels of uranium mining and milling. Radon and radon daughter monitoring techniques associated with uranium mining and milling are discussed.

  10. [Study on professor Yan Zhenghua's medication regularity in treating heart diseases based on association rules and entropy cluster].

    Science.gov (United States)

    Wu, Jia-rui; Guo, Wei-xian; Zhang, Xiao-meng; Zhang, Bing; Zhang, Yue

    2015-04-01

    In this study, Professor Yan Zhenghua's recipes for treating heart diseases were collected to determine the frequency of and association rules among drugs by data mining methods such as the Apriori algorithm and complex system entropy clustering, and to summarize Professor Yan Zhenghua's medication experience in treating heart diseases. The results indicated that frequently used drugs included Salviae Miltiorrhizae Radix et Rhizoma, Parched Ziziphi Spinosae Semen, Polygoni Multiflori Caulis, Ostreae Concha, and Poria; frequently used drug combinations included "Ostreae Concha, Draconis Os", "Polygoni Multiflori Caulis, Parched Ziziphi Spinosae Semen", and "Salviae Miltiorrhizae Radix et Rhizoma, Parched Ziziphi Spinosae Semen". The drug combinations with a confidence of 1 included "Dalbergiae Odoriferae Lignum-->Salviae Miltiorrhizae Radix et Rhizoma", "Allii Macrostemonis Bulbus-->Parched Ziziphi Spinosae Semen", "Draconis Os-->Ostreae Concha", and "Salviae Miltiorrhizae Radix et Rhizoma, Draconis Os-->Ostreae Concha". The core drug combinations included "Chrysanthemi Flos-Gastrodiae Rhizoma-Tribuli Fructus", "Dipsaci Radix-Taxillus sutchuenensis-Achyranthis Bidentatae Radix", and "Margaritifera Concha-Polygoni Multiflori Caulis-Platycladi Semen-Draconis Os". PMID:26281606

  11. Association text classification of mining ItemSet significance

    Institute of Scientific and Technical Information of China (English)

    蔡金凤; 白清源

    2011-01-01

    To address the problem that the classifier-construction stage of association rule classification algorithms considers only whether feature words are present and ignores text feature weights, this paper proposes ISARC (ItemSet Significance-based ARC), an algorithm built on the association-rule-based text classification method ARC-BC that can improve the accuracy of associative text classification. The algorithm uses feature term weights to define the significance of a k-itemset, generates association rules by mining significant itemsets, and considers the effect of lift on the text to be classified. Experimental results show that ISARC, which mines significant itemsets, can improve the accuracy of associative text classification. Text classification technology is an important basis of information retrieval and text mining; its main task is to assign categories to texts from a given category set. Text classification has a wide range of applications in natural language processing and understanding, information organization and management, information filtering, and other areas. At present, text classification methods can be divided into three main groups: statistical methods, connection-based methods, and rule-based methods. The basic idea of the traditional associative text classification algorithm, the associative rule-based classifier by category (ARC-BC), is to use the association rule mining algorithm Apriori to generate frequent items, that is, feature items or itemsets that appear frequently, then to use these frequent items as rule antecedents and the category as the rule consequent to form a rule set, and finally to combine these rules into a classifier. When classifying a test sample, if the sample matches a rule's antecedent, the rule's confidence is added to the counter of the class that the rule belongs to. The test sample is assigned to the category whose counter holds the maximum cumulative confidence. However, the ARC-BC algorithm has two main drawbacks: (1) when constructing the classifier, it considers only the existence of feature words and ignores the weights of text features when mining frequent itemsets and generating association rules
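
    The classification step that this abstract attributes to ARC-BC (add the confidence of every matching rule to its class counter, then pick the class with the largest total) can be sketched as follows. This is a minimal illustration only: the rule set, the term names and the function name classify are hypothetical, and the sketch covers neither the Apriori rule generation nor the itemset significance weighting proposed for ISARC.

```python
from collections import defaultdict

# Hypothetical rule set: (antecedent terms, class label, confidence).
RULES = [
    (frozenset({"goal", "match"}), "sports", 0.9),
    (frozenset({"match"}), "sports", 0.6),
    (frozenset({"stock", "market"}), "finance", 0.8),
    (frozenset({"market"}), "finance", 0.5),
]


def classify(document_terms, rules):
    """Return the class whose matching rules have the largest summed confidence."""
    terms = set(document_terms)
    counters = defaultdict(float)
    for antecedent, label, confidence in rules:
        # A rule matches when all of its antecedent terms occur in the document.
        if antecedent <= terms:
            counters[label] += confidence
    if not counters:
        return None  # no rule matched; the sketch leaves the document unlabeled
    return max(counters, key=counters.get)


prediction = classify(["the", "match", "ended", "with", "a", "late", "goal"], RULES)
print(prediction)
```

    Returning None for unmatched documents is a choice made for this sketch; a real classifier would need a default class or another fallback.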

  12. DATA MINING TECHNIQUES: A SOURCE FOR CONSUMER BEHAVIOR ANALYSIS

    Directory of Open Access Journals (Sweden)

    Abhijit Raorane

    2011-09-01

    Various studies on consumer purchasing behaviors have been presented and used in real problems. Data mining techniques are expected to be a more effective tool for analyzing consumer behaviors. However, the data mining method has disadvantages as well as advantages. Therefore, it is important to select appropriate techniques to mine databases. The objective of this paper is to understand consumer behavior, the consumer's psychological condition at the time of purchase, and how a suitable data mining method can be applied to improve on conventional methods. Moreover, in an experiment, association rules are employed to mine rules for trusted customers using sales data from the supermarket industry.

  13. Applied data mining for business and industry

    CERN Document Server

    Giudici, Paolo

    2009-01-01

    The increasing availability of data in our current, information-overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application-oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications. Introduces data mining methods and applications. Covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods. Includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining. Features detailed case studies based on applied projects within industry. Incorporates discussion of data mining software, with case studies a...

  14. Air Pollution Monitoring & Tracking System Using Mobile Sensors and Analysis of Data Using Data Mining

    Directory of Open Access Journals (Sweden)

    Umesh M. Lanjewar, J. J. Shah

    2012-12-01

    This study proposes an air pollution monitoring system and the analysis of pollution data using the association rule data mining technique. Association rule data mining aims at finding association patterns among various parameters. In this paper, association rule mining is presented for finding association patterns among various air pollutants. For this, the Apriori algorithm of association rule data mining is used. Apriori is characterized as a level-by-level complete search algorithm. This algorithm is applied to data captured by gas sensors for CO, NO2 and SO2. As association rule mining can produce several sequence rules of contaminants, the proposed system design can enhance the reproducibility, reliability and selectivity of air pollution sensor output.
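
    The level-by-level search that characterizes Apriori can be illustrated with a small sketch. It assumes that the sensor readings have already been discretized into items such as "CO_high"; those item names, the sample readings and the function name apriori are hypothetical and are not taken from the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical discretized readings from CO, NO2 and SO2 sensors.
READINGS = [
    {"CO_high", "NO2_high", "SO2_low"},
    {"CO_high", "NO2_high", "SO2_high"},
    {"CO_low", "NO2_low", "SO2_low"},
    {"CO_high", "NO2_high", "SO2_low"},
]

result = apriori(READINGS, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

    With these toy readings the only frequent pair is CO_high together with NO2_high, which is the kind of co-occurrence pattern among pollutants that the abstract describes.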

  15. Mycobiota associated with larval mines of Thrypticus truncatus and T. sagittatus (Diptera: Dolichopodidae) on water hyacinth, Eichhornia crassipes, in Argentina.

    Science.gov (United States)

    Thrypticus truncatus is a candidate for biocontrol of water hyacinth; the larvae of this dipteran mine in the petioles and feed on the phloem in the vascular bundles. The mycobiota associated with T. truncatus and T. sagittatus mines was investigated during two surveys undertaken in the spring and a...

  16. Usage of Apriori Algorithm of Data Mining as an Application to Grievous Crimes against Women

    OpenAIRE

    Divya Bansal, Lekha Bhambhu

    2013-01-01

    Quantitative data must be converted into qualitative data, because association algorithms can only be applied to the latter. Association rules deal with frequent item sets, as many association algorithms such as the Apriori algorithm do, which is why the Apriori algorithm is used in most real-life applications. This paper covers the use of association rule mining in extracting patterns that occur frequently within a dataset and showcases the implementation of the Apriori algorithm in mining assoc...

  17. Socioeconomic inequality of cancer mortality in the United States: a spatial data mining approach

    OpenAIRE

    Lam Nina SN; Vinnakota Srinivas

    2006-01-01

    Background: The objective of this study was to demonstrate the use of an association rule mining approach to discover associations between selected socioeconomic variables and the four leading causes of cancer mortality in the United States. An association rule mining algorithm was applied to extract associations between the 1988–1992 cancer mortality rates for colorectal, lung, breast, and prostate cancers defined at the Health Service Area level and selected socioeconomic varia...

  18. Leaf Associated Microbial Activities in a Stream Affected by Acid Mine Drainage

    Science.gov (United States)

    Schlief, Jeanette

    2004-11-01

    Microbial activity was assessed on birch leaves and plastic strips during 140 days of exposure at three sites in an acidic stream of the Lusatian post-mining landscape, Germany. The sites differed in their degrees of ochre deposition and acidification. The aim of the study was (1) to follow the microbial activities during leaf colonization, (2) to compare the effect of different environmental conditions on leaf associated microbial activities, and (3) to test the microbial availability of leaf litter in acidic mining waters. The activity peaked after 49 days and subsequently decreased gradually at all sites. The formation of iron plaques on leaf surfaces influenced associated microbial activity. It seemed that these plaques inhibit the microbial availability of leaf litter and serve as a microbial habitat themselves.

  19. Mining Multi-Level Frequent Itemsets under Constraints

    CERN Document Server

    Gouider, Mohamed Salah

    2010-01-01

    Mining association rules is a data mining task that extracts knowledge in the form of significant implication relations between useful items (objects) in a database. Mining multilevel association rules uses concept hierarchies, also called taxonomies and defined as relations of type 'is-a' between objects, to extract rules whose items belong to different levels of abstraction. These rules are more useful, more refined and more interpretable by the user. Several algorithms have been proposed in the literature to discover multilevel association rules. In this article, we are interested in the problem of discovering multi-level frequent itemsets under constraints, involving the user in the search process. We propose a technique for modeling and interpreting constraints in the context of concept hierarchies. Three approaches for discovering multi-level frequent itemsets under constraints are proposed and discussed: the basic approach, the "Test and Generate" approach and the pruning-based approach.
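
    One common way to obtain itemsets that span several levels of an 'is-a' hierarchy is to add every ancestor of each purchased item to the transaction before counting. The sketch below shows that idea for pairs only. The taxonomy, the transactions and all function names are hypothetical, and the sketch does not reproduce the constraint handling or any of the three approaches proposed in the article.

```python
from collections import Counter
from itertools import combinations

# Hypothetical 'is-a' taxonomy: child -> parent.
IS_A = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
    "milk": "food",
    "bread": "food",
}


def ancestors(item):
    """Return all ancestors of an item, nearest first."""
    found = []
    while item in IS_A:
        item = IS_A[item]
        found.append(item)
    return found


def extend(transaction):
    """Add every ancestor of every item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item))
    return extended


def frequent_pairs(transactions, min_support):
    """Count cross-level pairs, skipping pairs of an item with its own ancestor."""
    counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(extend(t)), 2):
            if a in ancestors(b) or b in ancestors(a):
                continue  # e.g. ("milk", "skim milk") carries no information
            counts[(a, b)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}


TRANSACTIONS = [
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk", "wheat bread"},
]

pairs = frequent_pairs(TRANSACTIONS, min_support=2)
print(pairs)
```

    No leaf-level pair reaches the threshold in this toy data, while the generalized pair ("bread", "milk") occurs in all three transactions, which is why rules at higher levels of abstraction can be found where leaf-level rules cannot.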

  20. The Relationship Rule of Data Mining

    Institute of Scientific and Technical Information of China (English)

    戴稳胜; 匡宏波; 谢邦昌

    2002-01-01

    The paper describes the classification of relationship rules, the standards for judging their effects, and the procedure for realizing them by computer. The paper fully introduces the relevant knowledge about relationship rules.

  1. Anomaly Detection in XML-Structured SOAP Messages Using Tree-Based Association Rule Mining

    OpenAIRE

    Esfahani, Reyhaneh Ghassem; Azgomi, Mohammad Abadollahi; Fathi, Reza

    2016-01-01

    Web services are software systems designed for supporting interoperable dynamic cross-enterprise interactions. Attacks on Web services can be catastrophic and can cause the disclosure of enterprises' confidential data. As new approaches to attacking arise every day, anomaly detection systems seem to be invaluable tools in this context. The aim of this work has been to target the attacks that reside in the Web service layer and the extensible markup language (XML)-structured simple...

  2. Mining Association Rules to Evade Network Intrusion in Network Audit Data

    OpenAIRE

    Kamini Nalavade; B. B. Meshram

    2014-01-01

    With the growth of hacking and exploiting tools and the invention of new ways of intrusion, intrusion detection and prevention is becoming the major challenge in the world of network security. The increasing network traffic and data on the Internet are making this task more demanding. Various approaches are being utilized in intrusion detection, but unfortunately none of the systems so far is completely flawless. The false positive rates make it extremely hard to analyse and react to attacks...

  3. Text Classification using the Concept of Association Rule of Data Mining

    OpenAIRE

    Rahman, Chowdhury Mofizur; Sohel, Ferdous Ahmed; Naushad, Parvez; Kamruzzaman, S. M.

    2010-01-01

    As the amount of online text increases, the demand for text classification to aid the analysis and management of text is increasing. Text is cheap, but information, in the form of knowing what classes a text belongs to, is expensive. Automatic classification of text can provide this information at low cost, but the classifiers themselves must be built with expensive human effort, or trained from texts which have themselves been manually classified. In this paper we will discuss a procedure of...

  4. Association Rule Mining Based Extraction of Semantic Relations Using Markov Logic Network

    Directory of Open Access Journals (Sweden)

    K.Karthikeyan

    2014-10-01

    An ontology may be a conceptualization of a website into a human-understandable yet machine-readable format consisting of entities, attributes, relationships and axioms. Ontologies formalize the intentional aspects of a site, whereas the denotative part is provided by a mental object that contains assertions about instances of concepts and relations. With semantic relations, it might be possible to extract the whole family tree of an outstanding personality using a resource like Wikipedia. In a way, relations describe the linguistic relationships among the entities involved, which is beneficial for a better understanding of human language. Relations can be identified from the result of concept hierarchy extraction. The existing ontology learning process produces only the result of concept hierarchy extraction; it does not produce the semantic relations between the concepts. Here, we construct the predicates and the first-order logic formulas, and we also perform inference and learn the weights using a Markov Logic Network. To improve the relations of every input and the relations between the contents, we propose the concept of ARSRE. This method can find the frequent items between concepts and extend existing lightweight ontologies to formal ones. The experimental results show good extraction of semantic relations compared to the state-of-the-art method.

  5. A Personalized Collaborative Filtering Recommendation Using Association Rules Mining and Self-Organizing Map

    OpenAIRE

    Hongwu Ye

    2011-01-01

    With the development of the Internet, the problem of information overload is becoming increasingly serious. People have all experienced the feeling of being overwhelmed by the number of new books, articles, and proceedings coming out each year. Many researchers are paying more attention to building a proper tool that can help users obtain personalized resources. Personalized recommendation systems are one such software tool used to help users obtain recommendations for unseen items based on their pre...

  6. USING HASH BASED APRIORI ALGORITHM TO REDUCE THE CANDIDATE 2-ITEMSETS FOR MINING ASSOCIATION RULE

    OpenAIRE

    K. Vanitha

    2011-01-01

    In this paper we describe an implementation of Hash based Apriori. We analyze, theoretically and experimentally, the principal data structure of our solution. This data structure is the main factor in the efficiency of our implementation. We propose an effective hash-based algorithm for the candidate set generation. Explicitly, the number of candidate 2-itemsets generated by the proposed algorithm is, in orders of magnitude, smaller than that by previous methods, thus resolving the performanc...
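
    The general idea behind hash-based reduction of candidate 2-itemsets can be sketched as follows: while counting single items in the first pass, every pair in every transaction is also hashed into a small table of bucket counters, and a pair is kept as a candidate only if both of its items are frequent and its bucket count reaches the minimum support. This is a sketch of the general technique under stated assumptions, not the author's implementation; the integer item ids, the toy hash function, the bucket count of 7 and the function name hashed_candidate_pairs are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def hashed_candidate_pairs(transactions, min_support, num_buckets=7):
    """First pass of a hash-based Apriori: return the surviving candidate pairs."""
    item_counts = Counter()
    bucket_counts = [0] * num_buckets

    def bucket(a, b):
        # Toy hash function over integer item ids (an assumption of this sketch).
        return (a * 31 + b) % num_buckets

    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for a, b in combinations(items, 2):
            bucket_counts[bucket(a, b)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)
    # A pair can be frequent only if its bucket count reaches min_support,
    # because the bucket count is an upper bound on the pair's support.
    return {
        (a, b)
        for a, b in combinations(frequent_items, 2)
        if bucket_counts[bucket(a, b)] >= min_support
    }


TRANSACTIONS = [[1, 2, 3], [1, 2, 4], [1, 2], [3, 4], [2, 3]]

candidates = hashed_candidate_pairs(TRANSACTIONS, min_support=2)
print(sorted(candidates))
```

    In this toy data all four items are frequent, so plain Apriori would generate six candidate pairs, whereas the bucket filter keeps four. Two of the four, (1, 3) and (3, 4), survive only because they collide in the same bucket, so a second counting pass is still needed to confirm which candidates are truly frequent.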

  7. Mining floating train data sequences for temporal association rules within a predictive maintenance framework

    OpenAIRE

    SAMMOURI, Wissam; COME, Etienne; OUKHELLOU, Latifa; Aknin, Patrice

    2013-01-01

    In order to meet the mounting social and economic demands, railway operators and manufacturers are striving for a longer availability and a better reliability of railway transportation systems. Commercial trains are being equipped with state-of-the-art onboard intelligent sensors monitoring various subsystems all over the train. These sensors provide real-time spatio-temporal data consisting of georeferenced timestamped events that tend sometimes to occur in bursts. Once ordered with respect ...

  8. The Reviewer's Assistant: Recommending Topics to Writers by Association Rule Mining and Case-base Reasoning

    OpenAIRE

    Dong, Ruihai; Schaal, Markus; O'Mahony, Michael P.; Smyth, Barry

    2012-01-01

    Today, online reviews for products and services have become an important class of user-generated content and they play a valuable role for countless online businesses by helping to convert casual browsers into informed and satisfied buyers. As users gravitate towards sites that offer insightful and objective reviews, the ability to source helpful reviews from a community of users is increasingly important. In this extended abstract we describe the Reviewer’s Assistant, a case-based reasoning ...

  9. NIOSH (National Institute for Occupational Safety and Health) testimony to Department of Labor on the Mine Safety and Health Administration proposed rule: ionizing radiation standards for metal and nonmetal mines, August 13, 1987 by R. Niemeier

    International Nuclear Information System (INIS)

    Recommendations were offered for protecting workers against the health effects of ionizing radiation in metal and nonmetal mines. Available data demonstrating such health effects was reviewed and evidence supporting the technical feasibility of reducing the current Mine Safety and Health Administration (MSHA) standard was presented. Five recent studies indicated a significant increase in lung cancer rates associated with radon progeny exposure in underground mines. Additional studies indicated an exposure/response relationship in uranium miners. The influence of smoking on the association between radon progeny exposure and lung cancer was cited. Evidence has indicated that exposure to radon progeny carries a potential risk of developing occupationally induced lung cancer. Risk-assessment data supported the conclusion that miners with the same characteristics as the United States Public Health Service uranium miners cohort and who accrue a cumulative occupational exposure of 120 working level months, would have a lung cancer excess lifetime risk of about 35 to 40 lung cancer deaths per 1000 exposed miners. Modern mining methods using dilution ventilation as well as bulkheading and backfilling techniques make it possible to achieve substantial reductions in the cumulative exposure to radon progeny. Information was provided on sampling strategy, control technology, ventilation systems, respirators, and medical surveillance programs

  10. Discriminative Pattern Identification using Rule Based Approach

    Directory of Open Access Journals (Sweden)

    Bhushan Mahajan

    2014-05-01

    Discrimination is biased behavior of people in society; in particular, discrimination is based on race, sex, age and caste. Discrimination is observed in many areas, such as the labour market, education, credit, mortgages and medicine. Scientists have studied it in many subjects, such as the social sciences, economics and law. A discrimination identification system relies on historical data for making decisions in socially sensitive actions. The technique of discrimination identification uses information systems based on data mining technology for decision making. Decision making systems and data mining techniques such as association rule mining have been designed and are now used for making automated decisions, like loan granting or denial. Discrimination situations are found in a dataset in direct and indirect ways. Rules are formed from the dataset using the Apriori algorithm and certain parameters such as the number of rules, minimum support and confidence. The power of discrimination within rules is calculated by elift and glift on a classification rule using alpha and strong alpha protection. In direct discrimination, the rules are directly extracted from the dataset and searched for discriminatory patterns. In indirect discrimination, the system needs some background knowledge as a further input, which is used to find unfair treatment. An inference model, which is a mathematical model, is required for integrating classification rules with background rules. Direct and indirect discrimination are tested on the German credit dataset.
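
    As an illustration of the elift measure named in this abstract, the sketch below uses the usual definition from the discrimination discovery literature: for a rule A, B -> C, where A is a potentially discriminatory item and B is the context, elift is conf(A, B -> C) divided by conf(B -> C), and the rule counts as alpha-protective when elift stays below a chosen threshold alpha. The records, the attribute names, the threshold of 1.2 and the function names are hypothetical toy values, not data from the paper or from the German credit dataset, and glift is not covered.

```python
def confidence(records, antecedent, consequent):
    """conf(X -> Y): share of the records matching X that also match Y."""
    matches_x = [
        r for r in records if all(r.get(k) == v for k, v in antecedent.items())
    ]
    if not matches_x:
        return 0.0
    matches_xy = [
        r for r in matches_x if all(r.get(k) == v for k, v in consequent.items())
    ]
    return len(matches_xy) / len(matches_x)


def elift(records, sensitive, context, consequent):
    """elift(A, B -> C) = conf(A, B -> C) / conf(B -> C)."""
    base = confidence(records, context, consequent)
    if base == 0.0:
        return 0.0  # the outcome never occurs in this context; ratio is undefined
    return confidence(records, {**sensitive, **context}, consequent) / base


# Hypothetical toy records.
RECORDS = (
    [{"foreign_worker": "yes", "city": "X", "credit": "deny"}] * 3
    + [{"foreign_worker": "yes", "city": "X", "credit": "grant"}] * 1
    + [{"foreign_worker": "no", "city": "X", "credit": "deny"}] * 1
    + [{"foreign_worker": "no", "city": "X", "credit": "grant"}] * 3
)

ALPHA = 1.2  # hypothetical threshold
value = elift(RECORDS, {"foreign_worker": "yes"}, {"city": "X"}, {"credit": "deny"})
print(value, "alpha-protective" if value < ALPHA else "potentially discriminatory")
```

    Here the denial rate is 0.75 for foreign workers in city X against 0.5 for everyone in city X, giving an elift of 1.5, so the rule would be flagged under the assumed threshold.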

  11. DDMGD: the database of text-mined associations between genes methylated in diseases from different species

    KAUST Repository

    Raies, A. B.

    2014-11-14

    Gathering information about associations between methylated genes and diseases is important for disease diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed the DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD we developed earlier and additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of extracted associations is 82% as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. Submission of new associations to DDMGD is provided. A comparison analysis of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases.

  12. The Stability of Memory Rules Associative with the Mathematical Thinking Core

    Directory of Open Access Journals (Sweden)

    Xiuzhen Wang

    2011-02-01

    Activation of how and where arithmetic operations are displayed in the brain has been observed in various number-processing tasks. However, it remains poorly understood whether stabilized memory of Boolean rules is associated with background knowledge. The present study reviewed behavioral and imaging evidence demonstrating that Boolean problem-solving abilities depend on the core systems of number processing. The core systems account for a mathematical cultural background and serve as the foundation for sophisticated mathematical knowledge. The Ebbinghaus paradigm was used to investigate learning-induced changes by functional magnetic resonance imaging (fMRI) in a retrieval task of Boolean rules. Functional imaging data revealed a common activation pattern in the left inferior parietal lobule and left inferior frontal gyrus during all Boolean tasks, regions which have been associated with number processing in former studies. All other regional activations were task-specific and prominently distributed in the left thalamus, bilateral parahippocampal gyrus, bilateral occipital lobe, and other subcortices during contrasting stabilized memory retrieval of Boolean tasks and number-processing tasks. The present results largely verified previous studies suggesting that activation patterns due to number processing appear to reflect a basic anatomical substrate of the stability of Boolean rule memory, which is derived from a network originally related to the core systems of number processing.

  13. A Meta-information-Based Method for Rough Sets Rule Parallel Mining

    Institute of Scientific and Technical Information of China (English)

    苏健; 高济

    2003-01-01

    Rough sets are an important method of data mining. Data mining processes such a great quantity of data in large databases that the speed of the rough sets data mining algorithm is critical to a data mining system. Utilizing network computing resources is an effective approach to improving the performance of a data mining system. This paper proposes the concept of meta-information, which is used to describe the result of rough sets data mining in an information system, and a meta-information-based method for parallel rule mining. This method decomposes the information system into many sub-information-systems, dispatches the task of generating the meta-information of each sub-information-system to task performers in the network, and lets them compute the meta-information in parallel; it then synthesizes the meta-information of the sub-information-systems into the meta-information of the information system in the task synthesizer, and finally produces the rules according to the meta-information.

  14. A Novel Data Mining Approach for Information Hiding

    OpenAIRE

    Shikha Sharma

    2012-01-01

    Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To preserve client privacy in the data mining process, a variety of techniques based on random perturbation of data records have been proposed recently. One known fact which is very important in data mining is discovering the association rules from database of transactions where each transaction consists of set of items. Two important t...

  15. An Enhanced Data Mining Technique for Hiding Sensitive Information

    OpenAIRE

    Abhishek Raghuvanshi

    2011-01-01

    Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To preserve client privacy in the data mining process, a variety of techniques based on random perturbation of data records have been proposed recently. One known fact which is very important in data mining is discovering the association rules from database of transactions where each transaction consists of set of items. Two important t...

  16. A framework for trend mining with application to medical data

    OpenAIRE

    Somaraki, Vassiliki

    2013-01-01

    This thesis presents research work conducted in the field of knowledge discovery. It presents an integrated trend-mining framework and SOMA, which is the application of the trend-mining framework in diabetic retinopathy data. Trend mining is the process of identifying and analysing trends in the context of the variation of support of the association/classification rules that have been extracted from longitudinal datasets. The integrated framework concerns all major processes from data prepar...

  17. A Novel Model of Secure Mining with Decision Matrix Technique

    OpenAIRE

    Prasanthi Kolluri; Satyanarayana Mummana

    2014-01-01

    Security in data mining is an important research issue nowadays. In this paper we propose a novel and efficient model of privacy-preserving association rule mining with a decision matrix approach; for security, we use the RSA algorithm for secure data transmission. In this approach we reduce the time complexity of finding the patterns by means of the decision matrix, and communication can be done with cipher datasets instead of plain datasets.

  18. Data warehousing and Phases used in Internet Mining

    OpenAIRE

    Jitender Ahlawat; Joni Birla; Mohit Yadav

    2011-01-01

    In this paper, we describe data warehousing and data mining. Data warehousing is the process of storing data on a large scale, and data mining is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. As massive amounts of data are continuously being collected and stored, many industries are becoming interested in mining some patterns (association rules, correlations, cluster...

  19. An investigation of the factors associated with interpretation of mine atmosphere for spontaneous combustion in coal mines

    OpenAIRE

    Adamus, Alois; Šancer, Jindřich; Guřanová, Pavla; Zubíček, Václav

    2011-01-01

    The risk of spontaneous combustion of coal is highly serious especially in gaseous underground coal mines. In many cases such a spontaneous combustion is a source of initiation of methane-explosive mixture with tragic consequences. Early indication of spontaneous combustion and determination of its seat temperature is in a given environment a key part of safety of underground coal mines. A commonly used method for the detection of spontaneous combustion is an interpretation of coal oxidation ...

  20. Gaseous Oxidized Mercury Flux from Substrates Associated with Industrial Scale Gold Mining in Nevada, USA

    Science.gov (United States)

    Miller, M. B.

    2015-12-01

    Gaseous elemental and oxidized mercury (Hg) fluxes were measured in a laboratory setting from substrate materials derived from industrial-scale open pit gold mining operations in Nevada, USA. Mercury is present in these substrates at a range of concentrations (10 to 40000 ng g^-1), predominantly of local geogenic origin in association with the mineralized gold ores, but altered and redistributed to a varying degree by subsequent ore extraction and processing operations, including deposition of Hg recently emitted to the atmosphere from large point sources on the mines. Waste rock, heap leach, and tailings material usually comprise the most extensive and Hg emission relevant substrate surfaces. All three of these material types were collected from active Nevada mine sites in 2010 for previous research, and have since been stored undisturbed at the University of Nevada, Reno. Gaseous elemental Hg (GEM) flux was previously measured from these materials under a variety of conditions, and was re-measured in this study, using Teflon® flux chambers and Tekran® 2537A automated ambient air analyzers. GEM flux from dry undisturbed materials was comparable between the two measurement periods. Gaseous oxidized Hg (GOM) flux from these materials was quantified using an active filter sampling method that consisted of polysulfone cation-exchange membranes deployed in conjunction with the GEM flux apparatus. Initial measurements conducted within greenhouse laboratory space indicate that in dry conditions GOM is deposited to relatively low Hg cap and leach materials, but may be emitted from the much higher Hg concentration tailings material.

  1. An Extensive Review of Significant Researches in Data Mining

    Directory of Open Access Journals (Sweden)

    Paul P. Mathai

    2014-06-01

    Data mining is defined as the extraction of novel, nontrivial information contained in large databases. Traditional data mining methods have mostly focused on finding statistical correlations between items that occur frequently in transaction databases. Data mining is commonly used in many applications across diverse fields such as medicine and marketing. Several methods and techniques have been developed for mining information from databases. In this study, we provide a comprehensive survey of existing methods for itemset mining based on utility and frequency and of research on association rule mining, together with a brief introduction to data mining and its advantages. We also present a concise description of data mining techniques, a performance review, and directions for future research.

  2. Mining Long, Sharable Patterns in Trajectories of Moving Objects

    DEFF Research Database (Denmark)

    Gidofalvi, Gyozo; Pedersen, Torben Bach

    2009-01-01

    The efficient analysis of spatio-temporal data, generated by moving objects, is an essential requirement for intelligent location-based services. Spatio-temporal rules can be found by constructing spatio-temporal baskets, from which traditional association rule mining methods can discover spatio......-scale synthetic data show the effectiveness of the method and its variants....

  3. Software-based multi-fault location technology based on association rule

    Institute of Scientific and Technical Information of China (English)

    张泽林; 赵洋

    2015-01-01

    In order to improve the efficiency of software-based fault localization, a software-based multi-fault localization technology based on association rules is proposed in this paper. With a clustering method, the failed test cases are sorted into clusters of specific errors, and then a software-based fault localization method based on crosstabs is used to find software faults. In the localization process, association rules are adopted to mine the relationship between highly suspicious code and software failures to improve the efficiency of fault localization. Finally, the proposed method and the Tarantula method are compared on the Siemens case set. The experimental results show that the multi-fault software localization technology based on association rules is more efficient than the Tarantula method.

  4. DiMeX: A Text Mining System for Mutation-Disease Association Extraction.

    Science.gov (United States)

    Mahmood, A S M Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K

    2016-01-01

    The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has been also evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases. PMID:27073839
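    The F-scores quoted above combine precision and recall into one number. As a minimal sketch (not DiMeX code), the standard F-measure can be computed as below. The abstract reports only the F-scores, so the precision and recall values in the example call are hypothetical, as is the function name.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score, the weighted harmonic mean of precision and recall.

    beta = 1 gives the usual F1 score, which weights both quantities equally.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denominator


# Hypothetical values, chosen only for illustration.
example = f_score(precision=0.9, recall=0.8)
```

    Because the harmonic mean is dominated by the smaller of its two inputs, a system cannot reach a high F-score by maximizing precision alone or recall alone.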

  5. Injury Profiles Associated with Artisanal and Small-Scale Gold Mining in Tarkwa, Ghana

    Directory of Open Access Journals (Sweden)

    Benedict N. L. Calys-Tagoe

    2015-07-01

    Artisanal and small-scale gold mining (ASGM) is inherently risky, but little is known about mining-associated hazards and injuries despite the tremendous growth worldwide of ASGM and the benefits it offers. The current study aimed to characterize the physical injuries associated with ASGM in Ghana to guide policy formulation. A cross-sectional survey was carried out in the Tarkwa mining district of the Western Region of Ghana in 2014. A total of 404 small-scale miners were recruited and interviewed regarding their occupational injury experiences over the preceding 10 years using a paper-based structured questionnaire. Nearly one-quarter (23.5%) of the miners interviewed reported getting injured over the previous 10 years, and the overall injury rate was calculated to be 5.39 per 100 person years. The rate was significantly higher for women (11.93 per 100 person years) and those with little mining experience (e.g., 25.31 per 100 person years for those with less than one year of work experience). The most injury-prone mining activities were excavation (58.7%) and crushing (23.1%), and over 70% of the injuries were reported to be due to miners being hit by an object. The majority of the injuries (57%) were lacerations, and nearly 70% of the injuries were to the upper or lower limbs. Approximately one-third (34.7%) of the injuries resulted in miners missing more than two weeks of work. One-quarter of the injured workers believed that abnormal work pressure played a role in their injuries, and nearly two-fifths believed that their injuries could have been prevented, with many citing personal protective equipment as a solution. About one-quarter of the employees reported that their employers never seemed to be interested in the welfare or safety of their employees. These findings greatly advance our understanding of occupational hazards and injuries amongst ASGM workers and help identify several intervention points.

  6. Mining in Health Data by GUHA Method

    Czech Academy of Sciences Publication Activity Database

    Rauch, Jan

    Berlin: -, 2006 - (Ackermann, M.; Soares, C.; Guidemann, B.), s. 69-72 [ECML/PKDD 2006 Workshop on Practical Data Mining. Berlin (DE), 18.09.2006-22.09.2006] R&D Projects: GA MŠk 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords : medical data mining * GUHA method * association rules Subject RIV: BB - Applied Statistics, Operational Research

  7. Data Mining for Quality Prediction in Textile Engineering

    Institute of Scientific and Technical Information of China (English)

    YANG Jian-guo; LI Bei-zhi; ZHAO Ya-mei

    2006-01-01

    A data mining method for quality prediction using association rules (DMAR) is presented in this paper. Association rules are used to mine the valuable relations among items in large amounts of textile process data for an ANN prediction model. DMAR consists of three main steps: setting up the knowledge data set; data cleaning and conversion; and finding the item sets with large support and generating the expected rules. DMAR effectively improves the precision of prediction of yarn breaking. It rapidly removes the negative influence of training parameters on the prediction model, so that a more satisfactory quality prediction result can be reached.

  8. Efficient Mining of Frequent Closures with Precedence Links and Associated Generators

    OpenAIRE

    Szathmary, Laszlo; Valtchev, Petko; Napoli, Amedeo

    2008-01-01

    The effective construction of many association rule bases requires the computation of frequent closures, generators, and precedence links between closures. However, these tasks are rarely combined, and no scalable algorithm exists at present for their joint computation. We propose here a method that solves this challenging problem in two separate steps. First, we introduce a new algorithm called Touch for finding frequent closed itemsets (FCIs) and their generators (FGs). Touch applies depth-...

  9. Integrated Text Mining and Chemoinformatics Analysis Associates Diet to Health Benefit at Molecular Level.

    OpenAIRE

    Jensen, Kasper; Panagiotou, Gianni; Kouskoumvekaki, Irene

    2014-01-01

    Awareness that disease susceptibility is not only dependent on genetic make up, but can be affected by lifestyle decisions, has brought more attention to the role of diet. However, food is often treated as a black box, or the focus is limited to few, well-studied compounds, such as polyphenols, lipids and nutrients. In this work, we applied text mining and Naïve Bayes classification to assemble the knowledge space of food-phytochemical and food-disease associations, where we distinguish betwe...

  10. Integrated text mining and chemoinformatics analysis associates diet to health benefit at molecular level.

    OpenAIRE

    Kasper Jensen; Gianni Panagiotou; Irene Kouskoumvekaki

    2014-01-01

    Awareness that disease susceptibility is not only dependent on genetic make up, but can be affected by lifestyle decisions, has brought more attention to the role of diet. However, food is often treated as a black box, or the focus is limited to few, well-studied compounds, such as polyphenols, lipids and nutrients. In this work, we applied text mining and Naïve Bayes classification to assemble the knowledge space of food-phytochemical and food-disease associations, where we distinguish betwe...

  11. Enterprise Human Resources Information Mining Based on Improved Apriori Algorithm

    Directory of Open Access Journals (Sweden)

    Lei He

    2013-05-01

    With the continuing development of information technology in modern society, enterprises' demand for human resources information mining keeps growing. Based on the current state of enterprise human resources information mining, this paper puts forward a model for enterprise human resources information mining based on an improved Apriori algorithm. The model introduces data mining technology and the traditional Apriori algorithm and improves on that basis: it divides the association rule mining task of the original algorithm into two subtasks, producing frequent item sets and producing rules, uses SQL to generate frequent item sets directly, and uses a chart-based method to extract the information of interest to customers. The experimental results show that the model based on the improved Apriori algorithm is more efficient than the original algorithm, and the practical application test results show that the improved algorithm is practical and effective.
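    The two subtasks named in this abstract, producing frequent item sets and producing rules, can be sketched as follows. This is a plain-Python version of the classic Apriori procedure, not the paper's SQL-based improvement; the function names, the toy transactions and the thresholds are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Subtask 1: level-wise (Apriori) search for frequent item sets.

    Returns a dict mapping each frequent frozenset to its support,
    expressed as a fraction of all transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item that occurs anywhere.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        counted = {c: support(c) for c in current}
        level = {c: s for c, s in counted.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-sets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Subtask 2: derive rules antecedent -> consequent from frequent item sets."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent item set is itself frequent,
                # so its support is already stored in `frequent`.
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, confidence))
    return rules


# Hypothetical toy data: each transaction is a set of item labels.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
freq = frequent_itemsets(transactions, min_support=0.6)
rules = association_rules(freq, min_confidence=0.7)
```

    Keeping the two subtasks as separate functions mirrors the split described in the abstract: the rule step reads only the stored supports and never rescans the transactions.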

  12. A Novel Privacy Preserving Mining with Hybrid Pattern Mining and Key Generation

    OpenAIRE

    B. Ajit; C.P.V.N.J Mohan Rao; Sairam Vakkalanka

    2014-01-01

    Association rule mining over horizontal partitioning data is always an interesting research issue in the field of knowledge and data engineering. Data holder forwards the data sets to centralized server, privacy can be maintained by security protocols, and our security protocol communicates in terms of subsets from both the data holders and player as ingredients for secure transmission. Association rules can be generated at centralized server efficiently. In this paper we are proposing a priv...

  13. An Improved Pearson’s Correlation Proximity-Based Hierarchical Clustering for Mining Biological Association between Genes

    OpenAIRE

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the assoc...

  14. Radio-Ecological Situation in the Area of the Priargun Production Mining and Chemical Association - 13522

    Energy Technology Data Exchange (ETDEWEB)

    Semenova, M.P.; Seregin, V.A.; Kiselev, S.M.; Titov, A.V. [FSBI SRC A.I. Burnasyan Federal Medical Biophysical Center of FMBA of Russia, Zhivopisnaya Street, 46, Moscow (Russian Federation); Zhuravleva, L.A. [FSHE ' Centre of Hygiene and Epidemiology no. 107' under FMBA of Russia (Russian Federation); Marenny, A.M. [Ltd ' Radiation and Environmental Researches' (Russian Federation)

    2013-07-01

    'The Priargun Production Mining and Chemical Association' (hereinafter referred to as PPMCA) is a diversified mining company which, in addition to underground mining of uranium ore, carries out refining of such ores in hydrometallurgical process to produce natural uranium oxide. The PPMCA facilities are sources of radiation and chemical contamination of the environment in the areas of their location. In order to establish the strategy and develop criteria for the site remediation, independent radiation hygienic monitoring is being carried out over some years. In particular, this monitoring includes determination of concentration of the main dose-forming nuclides in the environmental media. The subjects of research include: soil, grass and local foodstuff (milk and potato), as well as media of open ponds (water, bottom sediments, water vegetation). We also measured the radon activity concentration inside surface workshops and auxiliaries. We determined the specific activity of the following natural radionuclides: U-238, Th-232, K-40, Ra-226. The researches performed showed that in soil, vegetation, groundwater and local foods sampled in the vicinity of the uranium mines, there is a significant excess of 226Ra and 232Th content compared to areas outside the zone of influence of uranium mining. The ecological and hygienic situation is as follows: - at health protection zone (HPZ) gamma dose rate outdoors varies within 0.11 to 5.4 μSv/h (The mean value in the reference (background) settlement (Soktui-Molozan village) is 0.14 μSv/h); - gamma dose rate in workshops within HPZ varies over the range 0.14 - 4.3 μSv/h. - the specific activity of natural radionuclides in soil at HPZ reaches 12800 Bq/kg and 510 Bq/kg for Ra-226 and Th-232, respectively. - beyond HPZ the elevated values for 226Ra have been registered near Lantsovo Lake - 430 Bq/kg; - the radon activity concentration in workshops within HPZ varies over the range 22 - 10800 Bq

  15. Radio-Ecological Situation in the Area of the Priargun Production Mining and Chemical Association - 13522

    International Nuclear Information System (INIS)

    'The Priargun Production Mining and Chemical Association' (hereinafter referred to as PPMCA) is a diversified mining company which, in addition to underground mining of uranium ore, carries out refining of such ores in hydrometallurgical process to produce natural uranium oxide. The PPMCA facilities are sources of radiation and chemical contamination of the environment in the areas of their location. In order to establish the strategy and develop criteria for the site remediation, independent radiation hygienic monitoring is being carried out over some years. In particular, this monitoring includes determination of concentration of the main dose-forming nuclides in the environmental media. The subjects of research include: soil, grass and local foodstuff (milk and potato), as well as media of open ponds (water, bottom sediments, water vegetation). We also measured the radon activity concentration inside surface workshops and auxiliaries. We determined the specific activity of the following natural radionuclides: U-238, Th-232, K-40, Ra-226. The researches performed showed that in soil, vegetation, groundwater and local foods sampled in the vicinity of the uranium mines, there is a significant excess of 226Ra and 232Th content compared to areas outside the zone of influence of uranium mining. The ecological and hygienic situation is as follows: - at health protection zone (HPZ) gamma dose rate outdoors varies within 0.11 to 5.4 μSv/h (The mean value in the reference (background) settlement (Soktui-Molozan village) is 0.14 μSv/h); - gamma dose rate in workshops within HPZ varies over the range 0.14 - 4.3 μSv/h. - the specific activity of natural radionuclides in soil at HPZ reaches 12800 Bq/kg and 510 Bq/kg for Ra-226 and Th-232, respectively. - beyond HPZ the elevated values for 226Ra have been registered near Lantsovo Lake - 430 Bq/kg; - the radon activity concentration in workshops within HPZ varies over the range 22 - 10800 Bq/m3. The seasonal dependence of

  16. Research on Product Family Configuration Based on Multidimensional Association Rules

    Institute of Scientific and Technical Information of China (English)

    罗妤; 郭钢; 徐建萍

    2011-01-01

    A product configuration method based on association rules is introduced, aimed at the choice of optional parts in the product configuration of large complex products. A component-constraint-rule data warehouse (CCRDW) can be built from the information in the product family BOM (bill of material) structure. According to the parameters of customer requirements, a data cube is established from the data warehouse, and by applying a multidimensional association rule data mining algorithm to the data cube, the appropriate components in the materials store are found. The configuration method is illustrated with a gear box instance. The method has advantages in enhancing configuration efficiency and the reusability of components.

  17. Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the automated assignment of natural language texts to predefined categories based on their content. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Nowadays the demand for text classification is increasing tremendously. Keeping this demand in consideration, new and updated techniques are being developed for the purpose of automated text classification. This paper presents a new algorithm for text classification. Instead of using words, word relations, i.e. association rules, are used to derive the feature set from pre-classified text documents. The concept of the Naive Bayes Classifier is then used on the derived features, and finally a concept of Genetic Algorithm has been added for final classification. A system based on the proposed algorithm has been implemented and tested. The experimental ...

  18. Prevalence and factors associated with obesity amongst employees of open-cast diamond mine in Namibia

    Directory of Open Access Journals (Sweden)

    Desderius Haufiku

    2015-09-01

    The study investigated the prevalence of and factors associated with obesity amongst employees of Pocket Beaches mine. Obesity rates are increasing at an alarming rate worldwide; 1.2 billion people worldwide are overweight, of which 300 million are clinically obese. Of concern is that obesity is a risk factor for many diseases, including hypertension, diabetes and some forms of cancer. Although several mine workers who report to occupational health services for minor ailments are found to be overweight or obese, the extent of the problem is not known. The health risks associated with obesity could cause a big loss to NAMDEB in terms of care costs, low productivity and absenteeism. The aim of this study was to investigate the prevalence and determinants of obesity amongst NAMDEB employees working at Pocket Beaches diamond mine. A descriptive, cross-sectional study measured the prevalence of obesity and described the factors associated with obesity and overweight. Study population: NAMDEB employees who were working at Pocket Beaches mine. A simple random sampling technique was used to select participants. Eighty-seven employees were selected from the 188 total NAMDEB employees working at Pocket Beaches mine. Data were collected through interviews. Anthropometric measurements, namely weight, height and abdominal circumference, were collected using a standard protocol. Data were analyzed using Epi Info 2002. Body Mass Index (BMI) was calculated as kg/m2. Overweight was defined as BMI = 25 to 29.9 kg/m2 and obesity as BMI ≥ 30 kg/m2. Waist circumference ≥80 cm was used to identify central obesity in women and ≥90 cm in men. The frequency of participation in physical activity, barriers to physical activity and food consumption are reported in percent and means. The study found a prevalence of 42% overweight and 32% obesity among employees of NAMDEB. A significant number of participants (48%) never participate in moderate
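    The classification criteria stated in this abstract (BMI as kg/m2, overweight from 25 kg/m2, obesity from 30 kg/m2, and central obesity at a waist circumference of at least 80 cm in women and 90 cm in men) can be written as a short sketch. The function names and the label for values below 25 are assumptions. Values between 29.9 and 30, which the abstract leaves unassigned, are treated here as overweight.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index in kg/m2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Classify a BMI value using the cut-offs given in the abstract."""
    if value >= 30.0:
        return "obese"
    if value >= 25.0:
        # Assumption: the unassigned gap between 29.9 and 30 counts as overweight.
        return "overweight"
    # Assumption: the abstract gives no label below 25 kg/m2.
    return "below overweight threshold"


def has_central_obesity(waist_cm: float, sex: str) -> bool:
    """Apply the waist-circumference cut-offs: 80 cm (women), 90 cm (men)."""
    thresholds = {"female": 80.0, "male": 90.0}
    return waist_cm >= thresholds[sex]


# Hypothetical participant, for illustration only.
category = bmi_category(bmi(weight_kg=90.0, height_m=1.75))
```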

  19. A Knowledge Mining Model for Ranking Institutions using Rough Computing with Ordering Rules and Formal Concept Analysis

    Directory of Open Access Journals (Sweden)

    D P Acharjya

    2011-03-01

    The emergence of computers and the information technology revolution have brought tremendous changes to the real world and provide a different dimension for intelligent data analysis. It is a well-established fact that information at the right time and in the right place yields better knowledge. However, challenges arise when a large volume of inconsistent data is given for decision making and knowledge extraction. To handle such imprecise data, researchers have in the recent past developed important mathematical tools, namely fuzzy sets, intuitionistic fuzzy sets, rough sets, formal concept analysis and ordering rules. It is also observed that many information systems contain numerical attribute values that are almost similar rather than exactly similar. To handle this type of information system, in this paper we use two processes, a pre-process and a post-process. In the pre-process we use rough sets on intuitionistic fuzzy approximation spaces with ordering rules for finding knowledge, whereas in the post-process we use formal concept analysis to explore better knowledge and the vital factors affecting decisions.

  20. GWA study data mining and independent replication identify cardiomyopathy-associated 5 (CMYA5) as a risk gene for schizophrenia

    DEFF Research Database (Denmark)

    Chen, X; Lee, G; Maher, B S;

    2011-01-01

    We conducted data-mining analyses using the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) and molecular genetics of schizophrenia genome-wide association study supported by the genetic association information network (MGS-GAIN) schizophrenia data sets and performed bioinform...

  1. Fusing Data Mining, Machine Learning and Traditional Statistics to Detect Biomarkers Associated with Depression.

    Directory of Open Access Journals (Sweden)

    Joanna F Dipnall

    Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016), and current smokers (p<0.001). The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and complex survey sampling

  2. Pollution of the stream waters and sediments associated with the Crucea uranium mine (East Carpathians, Romania)

    Science.gov (United States)

    Petrescu, L.; Bilal, E.; Iatan, E. L.

    2009-04-01

    standards limits. The uranium concentration ranged from 0.016 mg·L-1 to 1.43 mg·L-1, with a mean of 0.365 mg·L-1. A remarkably good correlation exists between dissolved U and the total anion concentrations, indicating that uranium in these stream waters derived mainly from oxidation of uraniferous bitumen and/or dissolution of carbonates. Based on the correlation (r = 0.69) between U and the sum of the major cations Ca + Mg + K + Na and the linear correlation (r = 0.70) between U and silica, we find silicate weathering to be an additional source of soluble uranium. The concentrations of dissolved Th are quite low, with median values of 0.015 mg·L-1. The linear variation of dissolved thorium concentration with carbonate alkalinity (r = 0.86) strongly suggests that these concentrations are due to the increased alkalinity. The release of metals (U, Th and Pb) is amplified by mining activities. The degree of pollution of the sediments was classified using the index of geo-accumulation (Igeo). The Igeo values of U, Th and Pb are medium, with local high values that correspond to sediments classified as strongly to extremely polluted (Igeo > 6), while the rest of the elements present concentrations close to or lower than the background values. 71% of the uranium in bottom sediments is present in primary fractions and 21% is associated with carbonates. Thorium proved even more insoluble (94% in primary fractions). In view of the substantial mobility and bioavailability of the fractions, this is not an alarming feature. Although neither U nor Th has an appreciable "exchangeable" fraction, the isolation of specific U- and Th-rich sediment fractions helped to identify connections between bioavailability and genesis of sediments, which control ecosystem cycling of U and Th. The measurements carried out in the surroundings of a local uranium mine show that the impact of the Crucea mine on water quality downstream of the mining area is insignificant.
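    The abstract does not state how the index of geo-accumulation was computed. A minimal sketch, assuming the commonly used Müller definition Igeo = log2(Cn / (1.5 × Bn)), where Cn is the measured concentration and Bn the geochemical background, is given below. The function name and the example concentrations are hypothetical and are not taken from the study.

```python
import math


def geoaccumulation_index(concentration: float, background: float,
                          factor: float = 1.5) -> float:
    """Igeo = log2(Cn / (factor * Bn)), assuming the Mueller definition.

    `concentration` (Cn) and `background` (Bn) must share the same units.
    The factor 1.5 allows for natural variation in the background value.
    """
    if concentration <= 0 or background <= 0:
        raise ValueError("concentrations must be positive")
    return math.log2(concentration / (factor * background))


# Hypothetical sediment sample: 48 mg/kg measured against a 2 mg/kg background.
igeo = geoaccumulation_index(48.0, 2.0)
```

    Because the index is logarithmic, each additional unit of Igeo corresponds to a doubling of the enrichment over the background value.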

  3. Solubility relationships of aluminium and iron minerals associated with acid mine drainage

    International Nuclear Information System (INIS)

    The ability to properly manage the oxidation of pyritic minerals and associated acid mine drainage is dependent upon understanding the chemistry of the disposal environment. One accepted disposal method is placing pyritic-containing materials in the groundwater environment. The objective of this study was to examine solubility relationships of Al and Fe minerals associated with pyritic waste disposed in a low leaching aerobic saturated environment. Two eastern oil shales were used in this oxidizing equilibration study, a New Albany Shale (unweathered, 4.6 percent pyrite), and a Chattanooga Shale (weathered, 1.5 percent pyrite). Oil shale samples were equilibrated with distilled-deionized water from 1 to 180 d with a 1:1 solid-to-solution ratio. The suspensions were filtered and the clear filtrates were analyzed for total cations and anions. Ion activities were calculated from total concentrations. Below pH 6.0, depending upon SO42- activity, Al3+ solubility was controlled by AlOHSO4 (solid phase) for both shales. The results of this study indicate that below pH 6.0, Al3+ and Fe3+ solubilities, are limited by basic Al and Fe sulfate solid phases (AlOHSO4(s) and FeHSO4(s)). The results from this study further indicate that the acidity in oil shale waters is produced from the hydrolysis of Al3+ and Fe3+ activities in solution. These results indicate a fundamental change in the stoichiometric equations used to predict acidity from iron sulfide oxidation. The results of this study also indicate that water quality predictions associated with acid mine drainage can be based on fundamental thermodynamic relationships. As a result, waste management decisions can be based on waste-specific/site test methods

  4. Extract Knowledge and Association Rule from Free Log Data using an Apriori Algorithm

    Directory of Open Access Journals (Sweden)

    Hemant N. Randhir

    2013-09-01

    This paper aims to present a technique for making private log information public and for applying the Apriori algorithm to collected log files in order to extract knowledge from public and free log files using Web usage mining techniques.

  5. Extract Knowledge and Association Rule from Free Log Data using an Apriori Algorithm

    OpenAIRE

    Hemant N. Randhir; Ravindra Gupta; G.R. Selokar

    2013-01-01

    This paper aims to present a technique for making private log information public and for applying the Apriori algorithm to collected log files in order to extract knowledge from public and free log files using Web usage mining techniques.

  6. Research of the Occupational Psychological Impact Factors Based on the Frequent Item Mining of the Transactional Database

    Directory of Open Access Journals (Sweden)

    Cheng Dongmei

    2015-01-01

    Based on an extensive review of the literature on data mining and association rule mining, this paper starts from compressing the transactional database and proposes a frequent complementary item storage structure for the transactional database. Building on this analysis, the paper also studies an association rule mining algorithm based on this storage structure. Finally, the paper applies the mining algorithm in the test results analysis module of a team psychological health assessment system and extracts the relationships between the psychological impact factors, so as to provide guidance for psychologists in treating mental illness.

  7. Web Log Mining using Improved Version of Proposed Algorithm

    OpenAIRE

    Manish Shrivastava; Kapil Sharma; Angad Singh

    2011-01-01

    Association rule mining is one of the most important and popular data mining techniques. It extracts interesting correlations, frequent patterns and associations among sets of items in transaction databases or other data repositories. Most of the existing algorithms require multiple passes over the database to discover frequent patterns, resulting in a large number of disk reads and placing a huge burden on the input/output subsystem. In order to reduce repetitive disk reads, a novel met...
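
    The multiple database passes mentioned in this abstract are characteristic of level-wise algorithms such as Apriori. The following is a minimal sketch of that baseline behaviour, not the improved method the authors propose (the truncated abstract does not describe it). The function name `apriori`, the pass counter, and the toy transactions are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining.

    Returns (frequent, passes): a dict mapping each frequent itemset
    (as a frozenset) to its support count, and the number of full
    scans made over the transaction list.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    passes, k = 1, 2

    while frequent:
        prev = set(frequent)
        # Join frequent (k-1)-itemsets into k-itemset candidates and keep
        # only those whose (k-1)-subsets are all frequent (Apriori pruning).
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        if not candidates:
            break

        # Each level needs one more full scan of the database.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        passes += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result, passes


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    itemsets, scans = apriori(demo, min_support=3)
    print(len(itemsets), "frequent itemsets found in", scans, "scans")
```

    Each increase in itemset length costs another scan of the data, which is the disk-read burden the abstract refers to when the database does not fit in memory.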

  8. Treatment Rules of Sinomenium Acutum by Text Mining

    Institute of Scientific and Technical Information of China (English)

    李雨彦; 郑光; 刘良

    2015-01-01

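    Objective: The study summarized the treatment rules of Sinomenium acutum (Menispermaceae, SA) using text mining techniques. Methods: First, we conducted text mining by collecting related literature about SA from the Chinese Biomedical Literature (CBM) Database. Then structured query language was used for data processing and data stratification. An algorithm was used to analyze the basic laws of symptoms, TCM patterns, TCM herb compatibility and drug combination. Results: Sinomenium acutum was mainly used to treat diseases with symptoms such as ache, swelling, stiffness and malformation. Wind, cold, wetness, heat, sputum, stasis and deficiency were the main etiology and pathology. Sinomenium acutum was always used in combination with herbs with the functions of dispelling wind and eliminating dampness, nourishing the blood and promoting blood circulation, dredging collaterals, warming meridians and nourishing the kidney. Conclusion: By text mining we summarized the treatment rules of Sinomenium acutum in a systematic, comprehensive and precise way, providing a literature basis for future clinical application and drug research. % Objective: To explore the medication rules of Sinomenium acutum using text mining techniques. Methods: All literature involving Sinomenium acutum was retrieved and downloaded from the CBM database. Using a data stratification algorithm based on cleaning, noise reduction and keyword frequency statistics, we mined the rules of the diseases treated with Sinomenium acutum, the distribution of symptoms and TCM patterns, and the rules of its combined use with Chinese herbs, Chinese patent medicines, Western drugs, decoctions and acupuncture, and presented these rules visually. Results: Sinomenium acutum mainly treats conditions characterized by pain, swelling, stiffness and deformity; the TCM pathogenic elements involved are wind, cold, dampness, heat, phlegm, stasis and deficiency. The main disease treated is rheumatoid arthritis as defined in modern medicine, along with a variety of rheumatic diseases as well as chronic nephritis, hepatitis and arrhythmia. In terms of Chinese herbal use, Sinomenium acutum is mostly combined with herbs that dispel wind and eliminate dampness, nourish and activate the blood, dredge the collaterals, warm the meridians and tonify the kidney. In addition, Sinomenium acutum is often used together with immune-regulating and collateral-dredging drugs such as tripterygium glycosides and Huoluo pills. Conclusion: Data...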

  9. Redescription Mining With Three Primary Data Mining Functionalities

    Directory of Open Access Journals (Sweden)

    M. Kamala Kumari

    2012-09-01

    Full Text Available Describing an object in two ways, or shifting the vocabulary of the same concept, is redescription. Not a new problem, redescription mining aims to find the subsets of objects that afford multiple definitions, given a universal set of objects and a collection of features to describe them. Nowadays, the huge amounts of data available to classify or categorize lead to an ambiguous state, as this is accomplished in complementary and contradictory ways. Hence the data has to be reduced. This involves cataloging, classification, identifying rules among the data, and segmentation or partitioning of the data. The learning algorithms of data mining techniques applied to this data can often be viewed as a further form of data reduction. This sine qua non data is characterized by a multitude of descriptors. In a way, these descriptors are also made equivalent and hence reduced. Redescriptions can be obtained with scores of data mining techniques. In this paper we give an overview of how data mining functionalities such as classification, clustering and association rule mining achieve the goal of redescription.

  10. 78 FR 8821 - Abandoned Mine Land Reclamation Program; Limited Liability for Noncoal Reclamation by Certified...

    Science.gov (United States)

    2013-02-06

    ... environmental problems associated with abandoned mine lands include surface and ground water pollution... February 6, 2013 Part IV Department of the Interior Office of Surface Mining Reclamation and Enforcement 30... / Wednesday, February 6, 2013 / Proposed Rules DEPARTMENT OF THE INTERIOR Office of Surface...

  11. Law 19.126. It dictate Regulatory standards about Mining of great bearing

    International Nuclear Information System (INIS)

    It establishes rules for regulating large-scale mining projects, covering ownership, location, related mining activities, mine closure plans, exploitation concession contracts, the taxation regime, the canon, infractions and sanctions.

  12. Identifying the association rules between clinicopathologic factors and higher survival performance in operation-centric oral cancer patients using the Apriori algorithm.

    Science.gov (United States)

    Tang, Jen-Yang; Chuang, Li-Yeh; Hsi, Edward; Lin, Yu-Da; Yang, Cheng-Hong; Chang, Hsueh-Wei

    2013-01-01

    This study computationally determines the contribution of clinicopathologic factors correlated with 5-year survival in oral squamous cell carcinoma (OSCC) patients primarily treated by surgical operation (OP) followed by other treatments. From 2004 to 2010, the program enrolled 493 OSCC patients at the Kaohsiung Medical Hospital University. The clinicopathologic records were retrospectively reviewed and compared for survival analysis. The Apriori algorithm was applied to mine the association rules between these factors and improved survival. Univariate analysis of demographic data showed that grade/differentiation, clinical tumor size, pathology tumor size, and OP grouping were associated with survival longer than 36 months. Using the Apriori algorithm, multivariate correlation analysis identified the factors that coexistently provide good survival rates with higher lift values, such as grade/differentiation = 2, clinical stage group = early, primary site = tongue, and group = OP. Without the OP, the lift values are lower. In conclusion, this hospital-based analysis suggests that early OP and other treatments starting from OP are the key to improving the survival of OSCC patients, especially for early stage tongue cancer with moderate differentiation, having a better survival (>36 months) with varied OP approaches. PMID:23984353
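
    The lift values cited in this abstract compare how often a rule's consequent occurs together with its antecedent against how often it would be expected if the two were independent. A minimal sketch of the standard support, confidence and lift definitions follows. The function name `rule_metrics` and the eight toy records are invented for illustration and are not taken from the study's data.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    records    : list of sets of attribute=value strings
    antecedent : iterable of items on the left-hand side of the rule
    consequent : iterable of items on the right-hand side of the rule
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)

    support = n_ac / n                        # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0   # P(C | A)
    # Lift > 1 means C is more likely when A holds than it is overall.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


if __name__ == "__main__":
    # Hypothetical patient records; the values are made up.
    toy = [
        {"stage=early", "group=OP", "survival>36m"},
        {"stage=early", "group=OP", "survival>36m"},
        {"stage=early", "survival>36m"},
        {"stage=late", "group=OP"},
        {"stage=late"},
        {"stage=early", "group=OP"},
        {"stage=late", "survival>36m"},
        {"stage=late"},
    ]
    print(rule_metrics(toy, {"stage=early", "group=OP"}, {"survival>36m"}))
```

    Apriori first finds the frequent attribute combinations; metrics like these are then computed for each candidate rule, and rules are ranked by lift.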

  13. Data warehousing and Phases used in Internet Mining

    Directory of Open Access Journals (Sweden)

    Jitender Ahlawat

    2011-08-01

    Full Text Available In this paper, we describe data warehousing and data mining. Data warehousing is the process of storing data on a large scale, and data mining is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. As massive amounts of data are continuously being collected and stored, many industries are becoming interested in mining patterns (association rules, correlations, clusters, etc.) from their databases. Association rule mining is one of the important tasks used to find the frequent itemsets in a customer transactional database. Each transaction consists of the items purchased by a customer in a visit. Internet mining is the application of data mining techniques to discover patterns from the Internet. Internet Usage Mining (IUM) is the process of applying data mining techniques to web data. The data sources are mainly the web server logs, proxy server logs and cookies stored in the user's computer. IUM is composed of three phases, namely preprocessing, pattern discovery and pattern analysis. This paper describes these phases in detail. A necessary introduction to Internet mining is also provided as background knowledge.
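
    The preprocessing phase named in this abstract typically turns raw server log lines into per-visitor sessions that a pattern discovery step can consume. The sketch below is one possible illustration under stated assumptions, not the paper's procedure: it assumes logs in Common Log Format, treats the client host as the visitor identity, drops static resources and non-200 responses, and splits sessions on a 30-minute gap. The names `parse_line`, `sessionize` and `STATIC_SUFFIXES` are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def parse_line(line):
    """Parse one log line into (host, timestamp, path, status), or None."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), when, m.group("path"), int(m.group("status"))


def sessionize(lines, timeout=timedelta(minutes=30)):
    """Clean the log and group page requests into sessions (lists of paths)."""
    entries = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip malformed lines
        host, when, path, status = parsed
        if status != 200 or path.lower().endswith(STATIC_SUFFIXES):
            continue  # keep only successful page views
        entries.append((host, when, path))

    # Order by visitor, then by time, so session boundaries are easy to find.
    entries.sort(key=lambda e: (e[0], e[1]))

    sessions = []
    last_host, last_time = None, None
    for host, when, path in entries:
        if host != last_host or when - last_time > timeout:
            sessions.append([])  # new visitor or long gap: start a session
        sessions[-1].append(path)
        last_host, last_time = host, when
    return sessions
```

    Each resulting session can be treated as one transaction whose items are the pages visited, which is the input format that frequent itemset and association rule mining expect.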

  14. Analysis on Composition Rules of TCM Tranquilizer Based on Association Rules and Clustering Algorithm

    Institute of Scientific and Technical Information of China (English)

    吴嘉瑞; 金燕萍; 张晓朦; 张冰; 盛晓光

    2015-01-01

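    Objective: To analyze the prescription patterns of commonly used tranquilizing Chinese patent medicines. Methods: Prescriptions of tranquilizing medicines were collected from "The New National Medicine", a prescription database was built on the traditional Chinese medicine inheritance assist system, and association rules with the Apriori algorithm and complex system entropy clustering were used to determine the frequency of each drug in the prescriptions and the association rules between drugs. Results: High-frequency drugs included Poria Cocos Wolff, Radix Glycyrrhizae, Angelica sinensis, Radix Ophiopogonis and Cinnabaris, among others; high-frequency drug combinations included "Angelica sinensis, Poria Cocos Wolff", "Poria Cocos Wolff, Parched Semen Ziziphi Spinosae" and "Radix Glycyrrhizae, Poria Cocos Wolff", among others; association rules with high confidence included "Calculus Bovis, Cinnabaris" and "Semen Ziziphi Spinosae, Poria Cocos Wolff", among others; new prescriptions included "Poria Cocos Wolff, Parched Semen Ziziphi Spinosae, Radix Rehmanniae Preparata, Fructus Schisandrae Chinensis, Radix Salviae Miltiorrhizae, Radix Ophiopogonis, Radix Rehmanniae", among others. Conclusion: The drugs in tranquilizing Chinese patent medicine prescriptions mostly have the effects of nourishing the blood and calming the mind, supplementing qi and nourishing yin, and tranquilizing with heavy sedatives. % Objective: To explore composition rules of TCM tranquilizer prescriptions. Methods: The tranquilizer prescriptions in "The New National Medicine" were collected to build a database based on the traditional Chinese medicine inheritance assist system. Association rules with the Apriori algorithm and complex system entropy clustering were used to obtain the frequency of medicines and the association rules between drugs. Results: The data-mining results indicated that in the tranquilizer prescriptions, the most frequently used drugs were Poria Cocos Wolff, Radix Glycyrrhizae, Angelica sinensis, Radix Ophiopogonis and Cinnabaris. The most frequent drug combinations were "Angelica sinensis, Poria Cocos Wolff", "Poria Cocos Wolff, Parched Semen Ziziphi Spinosae" and "Radix Glycyrrhizae, Poria Cocos Wolff". The association rules with a high confidence coefficient included "Calculus Bovis, Cinnabaris" and "Semen Ziziphi Spinosae, Poria Cocos Wolff". The new prescriptions contained Poria Cocos Wolff, Parched Semen Ziziphi Spinosae, Radix Rehmanniae Preparata, Fructus Schisandrae Chinensis, Radix Salviae Miltiorrhizae, Radix Ophiopogonis, and Radix Rehmanniae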

  15. Habituation: a non-associative learning rule design for spiking neurons and an autonomous mobile robots implementation

    International Nuclear Information System (INIS)

    This paper presents a novel bio-inspired habituation function for robots under control by an artificial spiking neural network. This non-associative learning rule is modelled at the synaptic level and validated through robotic behaviours in reaction to different stimuli patterns in a dynamical virtual 3D world. Habituation is minimally represented to show an attenuated response after exposure to and perception of persistent external stimuli. Based on current neurosciences research, the originality of this rule includes modulated response to variable frequencies of the captured stimuli. Filtering out repetitive data from the natural habituation mechanism has been demonstrated to be a key factor in the attention phenomenon, and inserting such a rule operating at multiple temporal dimensions of stimuli increases a robot's adaptive behaviours by ignoring broader contextual irrelevant information. (paper)

  16. Habituation: a non-associative learning rule design for spiking neurons and an autonomous mobile robots implementation.

    Science.gov (United States)

    Cyr, André; Boukadoum, Mounir

    2013-03-01

    This paper presents a novel bio-inspired habituation function for robots under control by an artificial spiking neural network. This non-associative learning rule is modelled at the synaptic level and validated through robotic behaviours in reaction to different stimuli patterns in a dynamical virtual 3D world. Habituation is minimally represented to show an attenuated response after exposure to and perception of persistent external stimuli. Based on current neurosciences research, the originality of this rule includes modulated response to variable frequencies of the captured stimuli. Filtering out repetitive data from the natural habituation mechanism has been demonstrated to be a key factor in the attention phenomenon, and inserting such a rule operating at multiple temporal dimensions of stimuli increases a robot's adaptive behaviours by ignoring broader contextual irrelevant information. PMID:23385344

  17. Using Association Rules to Study the Co-evolution of Production & Test Code

    NARCIS (Netherlands)

    Lubsen, Z.; Zaidman, A.; Pinzger, M.

    2009-01-01

    Paper accepted for publication in the proceedings of the 6th International Working Conference on Mining Software Repositories (MSR 2009). Unit tests are generally acknowledged as an important aid to produce high quality code, as they provide quick feedback to developers on the correctness of their

  18. Security threats to data mining and analysis tools of TIA program

    OpenAIRE

    Swati Vashisht; Divya Singh; Bhanu Prakash Lohani

    2012-01-01

    Data mining is the process that attempts to discover patterns in large data sets. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection) and dependencies (association rule mining). This usually involves using database techniques such as spatial indexes. These patterns can then be seen as a kind of summary of the ...

  19. Lichens and mosses as monitors of industrial activity associated with uranium mining in Northern Ontario, Canada

    International Nuclear Information System (INIS)

    A modified X-ray fluorescence spectrometry technique allowed the detection of uranium in cryptogams with a detection limit of 0.5 to 1 μg U g-1 of plant material. The levels of five elements (Ti, Fe, Ni, Pb and U) in 109 lichen and 98 moss samples collected around two uranium mining communities in northeastern Ontario, Canada, are reported. Similar metal accumulation tendencies were observed for the pair of lichens, Cladonia rangiferina and C. mitis, and for the moss pair, Pleurozium schreberi and Dicranum spp. This interchangeability, combined with favourable availability, made the above species the most useful biological monitors. Inter-elemental content comparisons employing Pearson's linear correlation statistic indicated a strong positive association among the pairs iron/titanium, and uranium/lead. Somewhat weaker positive correlations were observed in the individual comparisons of uranium levels with iron, or titanium, or nickel content. The associations between elements in mosses and lichens were in excellent agreement with the grouping based on the composition of the local uranium ores and tailings. (author)

  20. Geomorphological changes associated with underground coal mining in the Fushun area, northeast China revealed by multitemporal satellite remote sensing data

    Energy Technology Data Exchange (ETDEWEB)

    Dong, Y.F.; Fu, B.H.; Ninomiya, Y. [China Earthquake Administration, Beijing (China). Inst. of Earthquake Science

    2009-07-01

    Fushun is a famous coal-mining city in northeastern China with more than 100 years of history. Long-term underground coal mining has caused serious surface subsidence in the eastern part of the city. In this study, multitemporal and multi-source satellite remote sensing data were used to detect subsidence and geomorphological changes associated with underground coal mining over a 10-year period (1996-2006). A digital elevation model (DEM) was generated through Synthetic Aperture Radar (SAR) interferometry processing using data from a pair of European Remote Sensing Satellite (ERS) SAR images acquired in 1996. In addition, a Shuttle Radar Topography Mission (SRTM) DEM obtained from data in 2000 and an Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) DEM from 2006 were used for this study. The multitemporal DEMs indicated that the maximum vertical displacement due to subsidence was around 13 m from 1996 to 2006. Multitemporal ASTER images showed that the flooded water area associated with subsidence had increased by 1.73 km² over the same time period. Field investigations and ground level measurements confirmed that the results obtained from the multitemporal remote sensing data agreed well with ground truth data. This study demonstrates that DEMs derived from multisource satellite remote sensing data can provide a powerful tool to map geomorphological changes associated with underground mining activities.