Frequent Itemsets and Association Rules - Apriori Algorithm

I'm trying to understand the basics of the Apriori (basket) algorithm for use in data mining.

The problem I'm having is best explained with an example:

Here is the transactional dataset:

t1: Milk, Chicken, Beer
t2: Chicken, Cheese
t3: Cheese, Boots
t4: Cheese, Chicken, Beer
t5: Chicken, Beer, Clothes, Cheese, Milk
t6: Clothes, Beer, Milk
t7: Beer, Milk, Clothes

The minimum support (minsup) is 0.5, or 50%.

From the above, the number of transactions is clearly 7, which means that for an itemset to be "frequent" it must appear in at least 4 of the 7 transactions (0.5 × 7 = 3.5, rounded up). So this was my frequent itemset 1:

F1:

 Milk = 4
 Chicken = 4
 Beer = 5
 Cheese = 4
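As a quick check of the counts above, here is a minimal Python sketch. It assumes the seven transactions exactly as listed in the question; the variable names (`transactions`, `min_count`, `f1`) are illustrative and not part of any library.

```python
from collections import Counter
from math import ceil

# Transactions t1..t7, copied from the question.
transactions = [
    {"Milk", "Chicken", "Beer"},
    {"Chicken", "Cheese"},
    {"Cheese", "Boots"},
    {"Cheese", "Chicken", "Beer"},
    {"Chicken", "Beer", "Clothes", "Cheese", "Milk"},
    {"Clothes", "Beer", "Milk"},
    {"Beer", "Milk", "Clothes"},
]
minsup = 0.5

# Smallest transaction count that reaches minsup: ceil(0.5 * 7) = 4.
min_count = ceil(minsup * len(transactions))

# Count in how many transactions each single item appears.
item_counts = Counter(item for t in transactions for item in t)

# F1: single items whose count meets the threshold.
f1 = {item: n for item, n in item_counts.items() if n >= min_count}
print(min_count, sorted(f1.items()))
```

Clothes (3 transactions) and Boots (1 transaction) fall below the threshold, which is why they do not appear in F1.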

Then I created my candidates for the second pass (C2) and narrowed them down to:

F2:

 {Milk, Beer} = 4 
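The step from C2 to F2 can be sketched in Python as follows. This is only an illustration under the assumption that C2 is built by pairing the four frequent single items; the names `f1_items`, `c2` and `f2` are hypothetical.

```python
from itertools import combinations

# Transactions t1..t7, copied from the question.
transactions = [
    {"Milk", "Chicken", "Beer"},
    {"Chicken", "Cheese"},
    {"Cheese", "Boots"},
    {"Cheese", "Chicken", "Beer"},
    {"Chicken", "Beer", "Clothes", "Cheese", "Milk"},
    {"Clothes", "Beer", "Milk"},
    {"Beer", "Milk", "Clothes"},
]
min_count = 4  # at least 4 of 7 transactions for minsup = 0.5
f1_items = ["Beer", "Cheese", "Chicken", "Milk"]  # frequent single items

# C2: every pair of frequent single items.
c2 = [frozenset(pair) for pair in combinations(f1_items, 2)]

# Count the transactions that contain each candidate pair.
c2_counts = {pair: sum(pair <= t for t in transactions) for pair in c2}

# F2: candidate pairs whose count meets the threshold.
f2 = {pair: n for pair, n in c2_counts.items() if n >= min_count}
for pair, n in c2_counts.items():
    print(sorted(pair), n)
```

Of the six candidate pairs, only {Milk, Beer} reaches a count of 4, so it is the only member of F2.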

This is where I got confused: if I am asked to list all the frequent itemsets, do I write down all of F1 and F2, or just F2? The F1 entries don't feel like “sets” to me.

I am then asked to create association rules for the frequent itemsets I just found and to calculate their “confidence” figures. I get the following:

 Milk -> Beer = 100% confidence
 Beer -> Milk = 80% confidence
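For reference, these two figures follow from the usual definition of confidence, written out here with the support counts from F1 and F2 (the notation `supp` and `conf` is an assumption, not taken from the question):

```latex
\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}

\mathrm{conf}(\text{Milk} \Rightarrow \text{Beer})
  = \frac{\mathrm{supp}(\{\text{Milk}, \text{Beer}\})}{\mathrm{supp}(\{\text{Milk}\})}
  = \frac{4}{4} = 100\%

\mathrm{conf}(\text{Beer} \Rightarrow \text{Milk})
  = \frac{\mathrm{supp}(\{\text{Milk}, \text{Beer}\})}{\mathrm{supp}(\{\text{Beer}\})}
  = \frac{4}{5} = 80\%
```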

It seems unnecessary to include the F1 itemsets here, since they would always have 100% confidence regardless and don't actually “associate” anything. That is why I am now asking whether the F1 itemsets really count as frequent.

2 answers

Itemsets of size 1 are considered frequent if their support is high enough. But you also have to consider the minimum threshold. For example, if the minimum threshold in your example is 2, then F1 would not be considered; if the minimum threshold is 1, then you have to include it.

You can look here and here for more ideas and examples.

Hope I helped.


If the minimum support threshold (minsup) is 4/7, then you should include single items in the set of frequent itemsets if they appear in at least 4 of the 7 transactions. So in your example, you should include these:

Milk = 4
Chicken = 4
Beer = 5
Cheese = 4

Association rules have the form X ==> Y, where X and Y are disjoint itemsets, and it is usually assumed that neither X nor Y is the empty set (this is what Apriori assumes). Therefore, you need an itemset with at least two items to generate an association rule.
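To illustrate why single items are frequent but yield no rules, here is a small Python sketch. It assumes the support counts from the question and uses a hypothetical helper, `rules`, that splits each frequent itemset into a non-empty X and a non-empty Y.

```python
from itertools import combinations

# Support counts of the frequent itemsets from the question (F1 and F2).
support = {
    frozenset({"Milk"}): 4,
    frozenset({"Chicken"}): 4,
    frozenset({"Beer"}): 5,
    frozenset({"Cheese"}): 4,
    frozenset({"Milk", "Beer"}): 4,
}

def rules(itemset, support):
    """Yield (X, Y, confidence) for each split of itemset into non-empty, disjoint X and Y."""
    # range(1, 1) is empty, so an itemset with a single item produces no rules.
    for size in range(1, len(itemset)):
        for x in combinations(sorted(itemset), size):
            x = frozenset(x)
            y = itemset - x
            yield sorted(x), sorted(y), support[itemset] / support[x]

all_rules = [r for itemset in support for r in rules(itemset, support)]
for x, y, conf in all_rules:
    print(x, "==>", y, f"confidence = {conf:.0%}")
```

Only {Milk, Beer} produces rules, namely Beer ==> Milk at 80% and Milk ==> Beer at 100%. The four single items stay in the list of frequent itemsets but contribute no rules of their own.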

