Frequent pattern mining on big data using Apriori algorithm

Lakshminarayanan

doi:XX.XXX/IJARIIT-V3I5-1203

This paper is published in Volume 3, Issue 5, 2018

Paper Details

Area: Big Data

Author: Lakshminarayanan

Org/Univ: Anna University, Chennai, Tamil Nadu, India

Pub. Date: 28 May, 2018

Paper ID: V3I5-1203

Publisher: IJARnD

Edition: Volume 3, Issue 5, 2018

Keywords: Frequent pattern mining, Big data, Pruning, Support count, Confidence score, Map Reduce, Hadoop

Citations

IEEE
Lakshminarayanan. Frequent pattern mining on big data using Apriori algorithm, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARnD.com.

APA
Lakshminarayanan (2018). Frequent pattern mining on big data using Apriori algorithm. International Journal of Advance Research, Ideas and Innovations in Technology, 3(5) www.IJARnD.com.

MLA
Lakshminarayanan. "Frequent pattern mining on big data using Apriori algorithm." International Journal of Advance Research, Ideas and Innovations in Technology 3.5 (2018). www.IJARnD.com.


Abstract

Frequent pattern mining is one of the most important tasks for extracting meaningful and useful information from raw data. It aims to extract itemsets that represent any type of homogeneity and regularity in the data. Although many efficient algorithms have been developed for this task, the growing interest in data has caused the performance of existing pattern mining techniques to degrade. The goal of this paper is to propose new, efficient pattern mining algorithms for big data. Existing pattern mining algorithms are based on the homogeneity and regularity of data. With the dramatic increase in the scale of datasets collected and stored with cloud services in recent years, the mining process in the cloud requires more computational power. A number of works have also converted approximate mining computation into exact computation, but such methods neither improve accuracy nor enhance efficiency. The proposed algorithm uses the Hadoop Distributed File System (HDFS) for frequent pattern mining, which improves the performance of the system. An iterative Apriori algorithm is used to extract the frequent patterns from the dataset. In this approach, the first set of candidate itemsets is extracted from the initial dataset, and in each later iteration the candidate itemsets are generated from the results of the previous iteration. The support count is calculated for each candidate itemset; the support value is the frequency with which the itemset's items occur together in the dataset. The confidence value is calculated to find the dependency between itemsets. A threshold value is calculated, and pruning is performed based on this value.
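The iterative procedure described in the abstract can be illustrated with a small single-machine sketch in Python. This is not the paper's Hadoop implementation: the function names (count_support, apriori, confidence), the fixed min_support threshold, and the toy transactions are all assumptions made for illustration. In a MapReduce setting, the counting step would plausibly be split into a map phase that emits (candidate, 1) pairs and a reduce phase that sums them, but that mapping is likewise an assumption, not something the abstract specifies.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    counts = Counter()
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    # First pass: candidate 1-itemsets are taken from the initial dataset.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = count_support(transactions, candidates)
        # Prune candidates whose support count is below the threshold.
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from the frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Apriori pruning: every (k-1)-item subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A union C) / support(A)."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical toy dataset, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(transactions, min_support=3)
bread_to_milk = confidence(frequent, {"bread"}, {"milk"})
```

With a threshold of 3, each single item and each pair in this toy dataset survives pruning, while the three-item set (which appears in only two transactions) is discarded. The confidence of "bread implies milk" is then the pair's support count divided by the support count of "bread".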
