Enhanced Frequent Itemsets Based on Topic Modeling in Information Filtering

Published:2017-10 Issue:4 Volume:5 Page:33-43
ISSN:2166-7160
Container-title:International Journal of Software Innovation
language:en

Author:

Wai Than Than¹, Aung Sint Sint¹

Affiliation:

1. University of Computer Studies, Mandalay, Myanmar

Abstract

Many term-based and pattern-based approaches have been used in Information Filtering to derive users' information needs from a collection of documents. These approaches assume that the documents in the collection are all about one topic. However, users' interests can be diverse, and the documents in a collection often involve multiple topics. Topic modeling is useful in machine learning and text mining. It generates models that discover the multiple hidden topics in a collection of documents, and each topic is represented by a distribution over words. Its effectiveness in information filtering, however, has not been well explored. Patterns are generally thought to be more discriminative than single terms for describing documents. The major challenge in frequent pattern mining is the large number of resulting patterns: as the minimum threshold is lowered, an exponentially large number of patterns is generated. To address these limitations, this paper proposes a novel information filtering model, EFITM (Enhanced Frequent Itemsets based on Topic Model). Experimental results on the CRANFIELD dataset for the information filtering task show that the proposed model outperforms state-of-the-art models.

Publisher

IGI Global

Subject

Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Computer Science Applications, Software

References: 28 articles.

1. Topic based language models for ad hoc information retrieval

2. Mining frequent patterns with counting inference

3. Frequent term-based text clustering

4. Information filtering and information retrieval

5. Latent Dirichlet allocation; D. M. Blei; Journal of Machine Learning Research, 2003

Cited by 1 article.

1. On the modeling of cyber-attacks associated with social engineering: A parental control prototype; Journal of Information Security and Applications; 2023-06