An algebraic semigroup method for discovering maximal frequent itemsets

Published:2022-01-01 Issue:1 Volume:20 Page:1432-1443
ISSN:2391-5455
Container-title:Open Mathematics
language:en

Author:

Liu Jiang¹, Li Jing¹, Ni Feng¹, Xia Xiang¹, Li Shunlong¹, Dong Wenhui¹

Affiliation:

1. Department of Systems Science, University of Shanghai for Science and Technology, Shanghai 200093, China

Abstract

Discovering maximal frequent itemsets is an important issue and a key technique in many data mining problems, such as association rule mining. In the literature, generating maximal frequent itemsets has been shown either to be NP-hard or to have $O(l^{3}4^{l}(m+n))$ complexity in the worst case, from the perspective of generating maximal complete bipartite graphs of a bipartite graph, where $m$ and $n$ are the item number and the transaction number, respectively, and $l$ denotes the maximum of $|C||\Psi(C)|/(|C|+|\Psi(C)|-1)$, with the maximum taken over all maximal frequent itemsets $C$. In this article, we put forward a method, from the perspective of semigroup algebra, for discovering maximal frequent itemsets, whose complexity is $O(3mn2^{\beta}+4^{\beta}n)$, which is lower than the known complexity in the worst case, where $\beta$ is the number of items whose support is more than the minimum support threshold. Experiments also show that an algorithm based on the algebraic method performs better than three other well-known algorithms. Meanwhile, we explore some algebraic properties with respect to items and transactions, prove that the maximal frequent itemsets are exactly the simplified generators of frequent itemsets, give a necessary and sufficient condition for a maximal $(i+1)$-frequent itemset to be a subset of a closed $i$-frequent itemset, and provide a recurrence formula of maximal frequent itemsets.
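To make the notion of a maximal frequent itemset concrete, the following is a minimal brute-force sketch in Python. It is not the semigroup method of the article; it only illustrates the definition (a frequent itemset with no frequent proper superset) and why the number of frequent single items, called β in the abstract, drives the cost. The function name, the toy data, and the convention that support is an absolute transaction count compared with `>=` are assumptions made for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of maximal frequent itemsets.

    Illustrative only: this is NOT the article's semigroup method.
    Assumption: support is an absolute count and an itemset is frequent
    when its support is >= min_support.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t)

    # Only items that are frequent on their own can occur in a frequent
    # itemset, so the search space has 2**beta candidates, where beta is
    # the number of frequent single items.
    items = sorted({item for t in transactions for item in t})
    frequent_items = [i for i in items if support(frozenset([i])) >= min_support]

    maximal = []
    # Visit candidates from largest to smallest, so every frequent proper
    # superset of a candidate is already covered by an entry in `maximal`.
    for size in range(len(frequent_items), 0, -1):
        for combo in combinations(frequent_items, size):
            candidate = frozenset(combo)
            if any(candidate < m for m in maximal):
                continue  # a frequent proper superset exists: not maximal
            if support(candidate) >= min_support:
                maximal.append(candidate)
    return maximal


# Hypothetical toy database of five transactions.
toy_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]
result = maximal_frequent_itemsets(toy_transactions, min_support=2)
```

With a threshold of 2, item `d` is discarded before enumeration, so only the subsets of the three remaining items are examined. The enumeration is exponential in the number of frequent items, which is the quantity the complexity bound in the abstract is expressed in.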

Publisher

Walter de Gruyter GmbH

Subject

General Mathematics

Link

https://www.degruyter.com/document/doi/10.1515/math-2022-0516/pdf

References: 22 articles.

1. R. Agrawal, T. Imieliński, and A. Swami, Mining association rules between sets of items in large databases, ACM SIGMOD Record 22 (1993), no. 2, 207–216, https://doi.org/10.1145/170036.170072.

2. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo, Fast discovery of association rules, in: Advances in Knowledge Discovery and Data Mining, MIT Press, California, 1996, pp. 307–328.

3. J. Han and Y. Fu, Discovery of multiple-level association rules from large databases, in: VLDB ’95 Proceedings of the 21st International Conference on Very Large Data Bases, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1995, pp. 420–431.

4. W. Hwang and D. Kim, Improved association rule mining by modified trimming, in: The Sixth IEEE International Conference on Computer and Information Technology (CIT’06), IEEE Computer Society, Los Alamitos, CA, USA, 2006, pp. 24–24, https://doi.org/10.1109/CIT.2006.101.

5. H. Mannila, H. Toivonen, and A. I. Verkamo, Discovering frequent episodes in sequences, in: Proceedings of the First ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), AAAI Press, Palo Alto, CA, USA, 1995, pp. 210–215.