This program is a very efficient implementation of the MSAPRIORI algorithm proposed by Bing Liu, Wynne Hsu and Yiming Ma. MSAPRIORI is the most basic and well-known algorithm for finding frequent itemsets with multiple minimum supports in a transactional database.
A transactional database consists of a sequence of transactions: $T = \langle t_1, \ldots, t_n \rangle$. A transaction is a set of items ($t_j \subseteq I$, where $I$ denotes the set of all items). Transactions are often called baskets, referring to the primary application domain (i.e. market-basket analysis). A set of items is often called an itemset by the data mining community. The (absolute) support or the occurrence of an itemset $X$ (denoted by $\mathrm{supp}(X)$) is the number of transactions that are supersets of $X$ (i.e. that contain $X$). The relative support is the absolute support divided by the number of transactions (i.e. $n$). An itemset $X$ is frequent if its support is greater than or equal to its threshold value (mis(X)). If $X = \{i_1, \ldots, i_k\}$, then $\mathrm{mis}(X) = \min_{1 \le j \le k} \mathrm{mis}(i_j)$, where the mis values of the single items are predefined.
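The definitions above can be sketched in a few lines of Python. This is only an illustration, not the code of this program: the function names (`support`, `mis_of_itemset`, `is_frequent`), the toy baskets and the mis values are all made up for the example, and it assumes that the mis values are given as relative supports, so they are compared against the relative support of the itemset.

```python
from typing import Dict, FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> int:
    """Absolute support: number of transactions that contain every item."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= t)


def mis_of_itemset(itemset: Iterable[str], mis: Dict[str, float]) -> float:
    """Threshold of an itemset: the smallest mis value among its items."""
    return min(mis[item] for item in itemset)


def is_frequent(itemset: Iterable[str],
                transactions: List[Transaction],
                mis: Dict[str, float]) -> bool:
    """Assumption: mis values are relative, so compare relative support."""
    items = frozenset(itemset)
    relative_support = support(items, transactions) / len(transactions)
    return relative_support >= mis_of_itemset(items, mis)


# Hypothetical toy database with n = 4 transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "caviar"}),
    frozenset({"milk"}),
]
# Hypothetical per-item thresholds: the rare item gets a lower one.
mis = {"bread": 0.5, "milk": 0.5, "butter": 0.5, "caviar": 0.2}

print(support({"bread", "milk"}, transactions))
print(is_frequent({"caviar"}, transactions, mis))
print(is_frequent({"butter"}, transactions, mis))
```

In this toy database "caviar" and "butter" both occur in a single basket, but only "caviar" passes its threshold, because its own mis value is lower. This is the motivation for using item-specific minimum supports.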
In the frequent itemset mining problem a transaction database and the mis values of the items (traditionally denoted by mis(i)) are given, and we have to find all frequent itemsets.
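To make the problem statement concrete, the following sketch solves it by exhaustive enumeration on a tiny input. It is deliberately not MSAPRIORI: it tests every possible itemset, which is only feasible for a handful of items. The function name `frequent_itemsets`, the toy data and the assumption that mis values are relative supports are all illustrative.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def frequent_itemsets(transactions: List[FrozenSet[str]],
                      mis: Dict[str, float]) -> Dict[FrozenSet[str], int]:
    """Brute force: return every frequent itemset with its absolute support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result: Dict[FrozenSet[str], int] = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            supp = sum(1 for t in transactions if itemset <= t)
            # Threshold of the itemset is the minimum mis of its items.
            if supp / n >= min(mis[item] for item in candidate):
                result[itemset] = supp
    return result


# Hypothetical toy database and thresholds.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "caviar"}),
    frozenset({"bread", "milk", "caviar"}),
    frozenset({"milk"}),
]
mis = {"bread": 0.5, "milk": 0.5, "butter": 0.5, "caviar": 0.2}

found = frequent_itemsets(transactions, mis)
for itemset, supp in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

On this input {butter, caviar} is frequent although {butter} alone is not, because the pair inherits the lower threshold of "caviar". With item-specific thresholds a subset of a frequent itemset is therefore not necessarily frequent.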
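This program is also capable of mining association rules. An association rule is like an implication: $X \rightarrow Y$ means that if itemset $X$ occurs in a transaction, then itemset $Y$ also occurs with high probability. This probability is given by the confidence of the rule. It is an approximation of $p(Y|X)$: the number of transactions that contain both $X$ and $Y$ divided by the number of transactions that contain $X$, thus $\mathrm{conf}(X \rightarrow Y) = \mathrm{supp}(X \cup Y) / \mathrm{supp}(X)$. The relative support of the association rule is the relative support of the itemset $X \cup Y$. The lift of $X \rightarrow Y$ tries to capture the independence of the antecedent and the consequent of the rule: $\mathrm{lift}(X \rightarrow Y) = \mathrm{conf}(X \rightarrow Y) / (\mathrm{supp}(Y) / n)$, where $\mathrm{supp}$ is the absolute support and $n$ is the number of transactions. A lift of 1 corresponds to independence. An association rule is valid if its confidence, support and lift are greater than or equal to the corresponding threshold values.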
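The rule measures can be computed directly from absolute supports. The sketch below is illustrative only: the function names, the `min_lift` parameter name and the toy data are assumptions, and the support threshold of a rule is taken to be the smallest mis value among the items of the antecedent and the consequent, expressed as a relative support.

```python
from typing import Dict, FrozenSet, List


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Absolute support: number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(x: FrozenSet[str], y: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """supp(X u Y) / supp(X); assumes X occurs at least once."""
    return support(x | y, transactions) / support(x, transactions)


def lift(x: FrozenSet[str], y: FrozenSet[str],
         transactions: List[FrozenSet[str]]) -> float:
    """Confidence divided by the relative support of the consequent."""
    n = len(transactions)
    return confidence(x, y, transactions) / (support(y, transactions) / n)


def is_valid_rule(x: FrozenSet[str], y: FrozenSet[str],
                  transactions: List[FrozenSet[str]],
                  mis: Dict[str, float],
                  min_conf: float, min_lift: float) -> bool:
    """Check all three thresholds (min_lift is a hypothetical name)."""
    n = len(transactions)
    rule_support = support(x | y, transactions) / n
    threshold = min(mis[item] for item in x | y)
    return (rule_support >= threshold
            and confidence(x, y, transactions) >= min_conf
            and lift(x, y, transactions) >= min_lift)


# Hypothetical toy database and thresholds.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "caviar"}),
    frozenset({"bread", "milk", "caviar"}),
    frozenset({"milk"}),
]
mis = {"bread": 0.5, "milk": 0.5, "butter": 0.5, "caviar": 0.2}

caviar, bread = frozenset({"caviar"}), frozenset({"bread"})
print(confidence(caviar, bread, transactions))
print(lift(caviar, bread, transactions))
print(is_valid_rule(caviar, bread, transactions, mis, min_conf=0.8, min_lift=1.0))
```

Here every basket with "caviar" also contains "bread", so the confidence of the rule is 1, and the lift is above 1 because "bread" occurs in only three of the four baskets.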
In the association rule mining problem a transaction database, the mis values of the items (traditionally denoted by mis(i)), a confidence threshold (traditionally denoted by min_conf), and a lift threshold are given, and we have to find all valid association rules.