Association Rule Discovery, a data mining technique, was formally introduced by Rakesh Agrawal, Tomasz Imielinski and Arun Swami in 1993; the Apriori family of algorithms followed from Agrawal and Ramakrishnan Srikant in 1994. The research on this technique was carried out under the “Quest Project at IBM Almaden Research Center” and at the “University of Helsinki”. Association rules work on the principle of finding relationships between the items that occur together in the transactions of a database. Since each transaction contains a set of items, the goal is to find an item x whose occurrence in a transaction implies, with high probability, the occurrence of another item, say y. The association rule x=>y is accepted if its support and confidence are at least the user-specified minimum support (minsup) and minimum confidence (minconf). The support of an itemset is the percentage of all transactions in the database that contain it; the support of the rule x=>y is the support of the itemset containing both x and y. The confidence of x=>y is the percentage of the transactions containing x that also contain y, that is, the support of the whole itemset divided by the support of its antecedent subset. Finding association rules in a transactional database involves two steps. The first is to generate the frequent itemsets, meaning all k-itemsets (sets of k items) whose support is at least minsup. The second is to generate association rules from those frequent itemsets, keeping the rules whose confidence is at least minconf.
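To make the definitions of support and confidence and the two-step procedure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Apriori algorithm as published: the toy transactions, the function names and the thresholds are all hypothetical, and the candidate generation is a simplified level-wise search that skips Apriori's subset-pruning step.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    both = frozenset(antecedent) | frozenset(consequent)
    return count(both, transactions) / count(antecedent, transactions)

def frequent_itemsets(transactions, minsup):
    """Step 1: level-wise search for all itemsets with support >= minsup."""
    items = sorted(set().union(*transactions))
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates of size k that are frequent.
        level = {}
        for c in candidates:
            s = support(c, transactions)
            if s >= minsup:
                level[c] = s
        frequent.update(level)
        # Build size k+1 candidates by joining pairs of frequent k-itemsets.
        k += 1
        candidates = list({a | b for a in level for b in level if len(a | b) == k})
    return frequent

def association_rules(frequent, transactions, minconf):
    """Step 2: split each frequent itemset into antecedent => consequent."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                c = itemset - a
                conf = confidence(a, c, transactions)
                if conf >= minconf:
                    rules.append((a, c, sup, conf))
    return rules

if __name__ == "__main__":
    freq = frequent_itemsets(transactions, minsup=0.6)
    for a, c, sup, conf in association_rules(freq, transactions, minconf=0.8):
        print(sorted(a), "=>", sorted(c), "support", sup, "confidence", conf)
```

On this toy data, with minsup = 0.6 and minconf = 0.8, the only rule that survives is beer => diapers (support 0.6, confidence 1.0). The reverse rule diapers => beer has the same support but a confidence of only 0.75, which shows why the two thresholds are checked separately.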
Association Rule Discovery is one of the best-known and most widely accepted data mining strategies. It focuses on detecting interesting associations between items in large databases. It is commonly used for market basket analysis, where prior purchase records are used to determine the combinations of items that are likely to appeal to a consumer, and it has also been used to create predictive association rules for classification problems. The information collected using the Association Rule Discovery technique also helps companies in making decisions, forecasting sales, detecting fraud, etc.
The paper below summarizes the basic methodology of association rules and, in the third section, the algorithms used to mine them. These include the basic Apriori algorithm along with other algorithms such as AprioriTid, AprioriHybrid, MiRABit, Inverted Hashing and Pruning, and Perfect Hashing and Pruning. The fourth section of the paper includes two case studies which demonstrate the usefulness of the association rule technique in the real world. The first case study relates to the continuous analysis of market basket data in retail organizations, which helps management take decisions based on the associations revealed between various products. The second case study addresses the crucial problem of traffic safety in Belgium and explains the use of association rules as an effective data mining technique for identifying the spots and zones which are highly prone to accidents. This case study also uses association rules to discover the characteristics that discriminate between high-frequency and low-frequency accident locations. The fifth and final section contains the conclusion of the paper.
Tuesday, September 30, 2008