On the big data processing algorithms for finding frequent sequences

Author:

Can Ali Burak (1), Zaval Mounes (1, 2), Uzun‐Per Meryem (1, 3), Aktas Mehmet S. (2)

Affiliation:

1. BiletBank Research and Development Center Akdeniz PE‐TUR A.S. Istanbul Turkey

2. Computer Engineering Department Yildiz Technical University Istanbul Turkey

3. Computer Engineering Department Istanbul Health and Technology University Istanbul Turkey

Abstract

Sequential pattern mining algorithms extract frequently occurring sequences from ordered transactional datasets such as market basket datasets. There is a lack of research employing big data processing techniques to locate frequent sequences in large‐scale datasets. Furthermore, there is a need for optimized sequential pattern mining algorithms that run on ordered one‐dimensional sequences. We also observe a lack of sequential pattern search studies in the literature, which focuses mainly on multi‐dimensional data sequences. Existing approaches that deal with ordered one‐dimensional datasets suffer from scalability issues because the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large‐scale datasets. It also proposes a scalable sequential pattern mining algorithm called Sequential Pattern Acquisition by Reducing Search Space (SPARSS), designed for distributed data processing systems that efficiently handle large datasets containing sequential one‐element data. It introduces a prototype implementation of SPARSS and reports SPARSS's memory and time requirements, measured in experimental studies on a real‐world dataset. The results confirm our expectations and demonstrate SPARSS's superior scalability and run‐time efficiency compared to other distributed algorithms.
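To make the task concrete, the following is a minimal single‐machine sketch of frequent sequence mining over ordered one‐element sequences. It is not SPARSS and is not taken from the paper: it is a plain level‐wise (Apriori‐style) search, and the names `frequent_sequences`, `is_subsequence`, `min_support`, and `max_length`, as well as the sample baskets, are assumptions made for illustration.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def frequent_sequences(sequences, min_support, max_length=3):
    """Return {pattern: support} for patterns found in at least `min_support` sequences.

    Illustrative level-wise search only; a hypothetical helper, not the SPARSS algorithm.
    """
    # Level 1: count each item once per sequence.
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))
    frequent = {(item,): c for item, c in counts.items() if c >= min_support}
    result = dict(frequent)
    items = sorted(item for (item,) in frequent)

    length = 1
    while frequent and length < max_length:
        # Extend every frequent pattern by one frequent item.
        candidates = {p + (i,) for p in frequent for i in items}
        frequent = {}
        for cand in candidates:
            support = sum(is_subsequence(cand, s) for s in sequences)
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
        length += 1
    return result


# Hypothetical market-basket sequences (one item per element).
baskets = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["a", "b"]]
patterns = frequent_sequences(baskets, min_support=3)
```

Every candidate here is checked against every sequence, so the cost grows quickly with the number of items and sequences. That scalability problem is what the abstract says motivates distributed approaches and search space reduction.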

Publisher

Wiley

Subject

Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software


Cited by 2 articles.

1. E-Ticaret Siteleri için Kampanya Otomasyonu Tasarlanması ve Geliştirilmesi;Orclever Proceedings of Research and Development;2023-12-31

2. Special issue on High‐Performance Computing Conference (BASARIM 2022);Concurrency and Computation: Practice and Experience;2023-10-16
