Discovering Significant Sequential Patterns in Data Stream by an Efficient Two-Phase Procedure

Published:2022-12-13 Issue: Volume:2022 Page:1-23
ISSN:1563-5147
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Tang Huijun¹², Wang Le², Liu Yangguang², Qian Jiangbo¹

Affiliation:

1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China

2. Faculty of Finance and Information, Ningbo University of Finance & Economics, Ningbo 315175, China

Abstract

One essential topic in mining sequential patterns from a data stream is optimizing time and space costs. More importantly, however, attention should be paid to the significance of the mining results, since a large portion of them satisfy the user-defined constraints purely by accident and may have no statistical significance. In this paper, we propose FSSPDS, an efficient two-phase algorithm for discovering significant sequential patterns (SSPs) in a data stream with typical sliding windows, a setting that existing work has not considered. In the first phase, to generate high-quality SSP candidates, FSSPDS takes testable support and pattern length constraints into account, and insignificant patterns are removed promptly by a pattern-growth method. In the second phase, appropriate permutation testing is used to test the significance of the SSP candidates. Exact permutation p-values are obtained in a novel combined way based on the unconditional Barnard's test statistic, which better reflects how the data are generated and collected. Experimental evaluations show that FSSPDS discovers SSPs in the data stream while controlling the family-wise error rate (FWER) and rivals state-of-the-art approaches, especially in time efficiency, where it is approximately an order of magnitude better.

Funder

Chinese National Funding of Social Sciences

Publisher

Hindawi Limited

Subject

General Engineering,General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2022/5379086.pdf

References: 42 articles.

1. A survey of sequential pattern mining;P. Fournier-Viger;Data Science and Pattern Recognition,2017

2. Fast vertical mining of sequential patterns using Co-occurrence information;P. Fournier-Viger

3. Fast Utility Mining on Sequence Data

4. NegPSpan: efficient extraction of negative sequential patterns with embedding constraints

5. A tutorial on statistically sound pattern discovery

Cited by 1 article.

1. Probabilistic Support Prediction: Fast Frequent Itemset Mining in Dense Data;IEEE Access;2024