#include <MSApriori_Trie.hpp>
Collaboration diagram for MSApriori_Trie:
Public Member Functions | |
MSApriori_Trie (const unsigned long counter_of_emptyset, const vector< double > &mis_abs) | |
void | insert_frequent_items (const set< pair< itemtype, unsigned long > > &counters) |
Insert the frequent items and their counters into the trie;. | |
void | candidate_generation (const itemtype &frequent_size) |
Generates candidates. | |
void | find_candidate (const vector< itemtype > &basket, const itemtype candidate_size, const unsigned long counter=1) |
Increases the counter of those candidates that are contained by the given basket. | |
void | delete_infrequent (const itemtype candidate_size) |
Deletes unfrequent itemsets. | |
void | association (const double min_conf, Input_Output_Manager &input_output_manager) const |
Generates association rules. | |
itemtype | longest_path () const |
Returns the length of the longest path in the Apriori_Trie. | |
void | write_content_to_file (Input_Output_Manager &input_output_manager) const |
Writes the content (frequent itemsets) to the file. | |
void | show_content_preorder () const |
This procedure shows the Apriori_Trie in a preorde. | |
~MSApriori_Trie () | |
Protected Member Functions | |
bool | is_all_subset_frequent (const set< itemtype > &maybe_candidate) const |
Decides if all subset of an itemset is contained in the Apriori_Trie. | |
void | candidate_generation_two () |
Generates candidate of size two. | |
void | candidate_generation_assist (Trie *Trie, const itemtype distance_from_generator, set< itemtype > &maybe_candidate) |
Generates candidate of size more than two. | |
void | find_candidate_two (const vector< itemtype > &basket, const unsigned long counter=1) |
Increases the counter for those itempairs that are in the given basket. | |
void | delete_infrequent_two () |
Deletes the Tries that represent infrequent itemsets of size 2. | |
void | assoc_rule_find (const double min_conf, set< itemtype > &condition_part, set< itemtype > &consequence_part, const unsigned long union_support, Input_Output_Manager &input_output_manager) const |
void | assoc_rule_assist (const double min_conf, const Trie *Trie, set< itemtype > &consequence_part, Input_Output_Manager &input_output_manager) const |
void | write_content_to_file_assist (Input_Output_Manager &input_output_manager, const Trie *actual_state, const itemtype distance_from_frequent, set< itemtype > &frequent_itemset) const |
Writes out the content of the Apriori_Trie (frequent itemset and counters). | |
Protected Attributes | |
Trie | main_trie |
Trie to store the candidates and the frequent itemsets. | |
vector< vector< unsigned long > > | temp_counter_array |
temp_counter_array stores the occurences of the itempairs | |
vector< double > | mis_abs |
The mis values. |
Apriori_Trie is a special tree designed to provide efficient methods for the apriori algorithm. It mostly uses a regular trie except when there exist faster solution. For example for storing one and two itemset candidate where a simple vector and array gives better performance. Apriori_Trie extends the functions provided by the regular trie with a candidate generation process.
Definition at line 32 of file MSApriori_Trie.hpp.
|
Definition at line 20 of file MSApriori_Trie.cpp. |
|
Definition at line 114 of file MSApriori_Trie.cpp. |
|
Definition at line 264 of file MSApriori_Trie.cpp. References assoc_rule_find(), Trie::counter, and Trie::edgevector. Referenced by association(). |
|
Definition at line 240 of file MSApriori_Trie.cpp. References Trie::counter, Trie::is_included(), itemtype, main_trie, and Input_Output_Manager::write_out_basket(). Referenced by assoc_rule_assist(). |
|
Generates association rules.
Definition at line 84 of file MSApriori_Trie.cpp. References assoc_rule_assist(), and main_trie. Referenced by MSApriori::MSAPRIORI_alg(). |
|
Generates candidates.
Definition at line 40 of file MSApriori_Trie.cpp. References candidate_generation_assist(), candidate_generation_two(), itemtype, main_trie, and Trie::maxpath. Referenced by MSApriori::MSAPRIORI_alg(). |
|
Generates candidate of size more than two.
Definition at line 181 of file MSApriori_Trie.cpp. References Trie::add_empty_state(), Trie::edgevector, is_all_subset_frequent(), itemtype, Trie::max_path_set(), and Trie::maxpath. Referenced by candidate_generation(). |
|
Generates candidate of size two.
Definition at line 165 of file MSApriori_Trie.cpp. References main_trie, Trie::maxpath, mis_abs, and temp_counter_array. Referenced by candidate_generation(). |
|
Deletes unfrequent itemsets.
Definition at line 68 of file MSApriori_Trie.cpp. References delete_infrequent_two(), Trie::edgevector, itemtype, main_trie, and mis_abs. Referenced by MSApriori::MSAPRIORI_alg(). |
|
Deletes the Tries that represent infrequent itemsets of size 2. temp_counter_array[stateIndex_1] will never be used again! temp_counter_array will never be used again! Definition at line 218 of file MSApriori_Trie.cpp. References Edge_point_less(), Trie::edgevector, main_trie, Trie::max_path_set(), mis_abs, and temp_counter_array. Referenced by delete_infrequent(). |
|
Increases the counter of those candidates that are contained by the given basket.
Definition at line 57 of file MSApriori_Trie.cpp. References Trie::find_candidate(), find_candidate_two(), itemtype, and main_trie. Referenced by MSApriori::support(). |
|
Increases the counter for those itempairs that are in the given basket.
Definition at line 152 of file MSApriori_Trie.cpp. References temp_counter_array. Referenced by find_candidate(). |
|
Insert the frequent items and their counters into the trie;.
Definition at line 30 of file MSApriori_Trie.cpp. References Trie::add_empty_state(), Trie::edgevector, main_trie, and Trie::maxpath. Referenced by MSApriori::MSAPRIORI_alg(). |
|
Decides if all subset of an itemset is contained in the Apriori_Trie.
Definition at line 122 of file MSApriori_Trie.cpp. References Trie::is_included(), main_trie, and mis_abs. Referenced by candidate_generation_assist(). |
|
Returns the length of the longest path in the Apriori_Trie.
Definition at line 91 of file MSApriori_Trie.cpp. References itemtype, main_trie, and Trie::maxpath. Referenced by MSApriori::MSAPRIORI_alg(). |
|
This procedure shows the Apriori_Trie in a preorde.
Definition at line 108 of file MSApriori_Trie.cpp. References main_trie, and Trie::show_content_preorder(). |
|
Writes the content (frequent itemsets) to the file.
Definition at line 96 of file MSApriori_Trie.cpp. References Trie::counter, itemtype, main_trie, Trie::maxpath, and write_content_to_file_assist(). Referenced by MSApriori::MSAPRIORI_alg(). |
|
Writes out the content of the Apriori_Trie (frequent itemset and counters).
Definition at line 278 of file MSApriori_Trie.cpp. References Trie::counter, Trie::edgevector, itemtype, and Input_Output_Manager::write_out_basket_and_counter(). Referenced by write_content_to_file(). |
|
Trie to store the candidates and the frequent itemsets.
Definition at line 98 of file MSApriori_Trie.hpp. Referenced by assoc_rule_find(), association(), candidate_generation(), candidate_generation_two(), delete_infrequent(), delete_infrequent_two(), find_candidate(), insert_frequent_items(), is_all_subset_frequent(), longest_path(), show_content_preorder(), and write_content_to_file(). |
|
The mis values. mis_abs[i] strores the mis value of item i. mis_abs vector is sorted, because eacs item gets a new code so that the item withe the smallest mis value has the smallest code. This is for the sake of efficiency, and candidate generation simplicity. Definition at line 113 of file MSApriori_Trie.hpp. Referenced by candidate_generation_two(), delete_infrequent(), delete_infrequent_two(), and is_all_subset_frequent(). |
|
temp_counter_array stores the occurences of the itempairs We can use a simple array to determine the support of itemset of size two. This requires less memory than the trie-based supportcount. temp_counter_array[i][j-i] stores the occurence of the itempair (i,j). Definition at line 106 of file MSApriori_Trie.hpp. Referenced by candidate_generation_two(), delete_infrequent_two(), and find_candidate_two(). |