Affiliation:
1. University of Glasgow, Glasgow, UK
Abstract
Many recent recommendation systems leverage the large volume of reviews that users write about items. However, accurately measuring the usefulness of such reviews for effective recommendation is both challenging and important. In particular, users have been shown to exhibit distinct preferences over different types of reviews (e.g., preferring longer over shorter or recent over old reviews), indicating that users differ in their viewpoints on what makes a review useful. Yet, there have been few studies that account for the personalised usefulness of reviews when estimating users' preferences. In this article, we propose a novel neural model, called BanditProp, which addresses this gap in the literature. It first models reviews according to both their content and their associated properties (e.g., length, sentiment and recency). Thereafter, it constructs a multi-task learning (MTL) framework to model the reviews' content encoded with the various properties. In this MTL framework, each task corresponds to producing recommendations that focus on an individual property. Next, we cast the selection of features from reviews with different properties as a bandit problem with multinomial rewards. We propose a neural contextual bandit algorithm (ConvBandit) and examine its effectiveness, in comparison to eight existing bandit algorithms, in addressing this bandit problem. Our extensive experiments on two well-known datasets, Amazon and Yelp, show that BanditProp significantly outperforms one classic and six state-of-the-art recommendation baselines. Moreover, BanditProp with ConvBandit consistently outperforms variants using the other bandit algorithms on both datasets. In particular, we experimentally demonstrate the effectiveness of our proposed customised multinomial rewards, in comparison to binary rewards, when addressing our bandit problem.
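To make the bandit formulation in the abstract concrete, the following is a minimal, illustrative sketch and not the authors' implementation: each arm corresponds to one review property (length, sentiment, recency), pulling an arm means relying on the features derived from that property, and feedback is a graded (multinomial) reward rather than a binary one. The epsilon-greedy policy, the per-arm linear reward models, and all names and dimensions (N_PROPERTIES, CONTEXT_DIM, REWARD_LEVELS, etc.) are invented for this example; the paper's ConvBandit instead uses a neural (convolutional) model.

```python
# Illustrative sketch of a contextual bandit over review properties with
# multinomial rewards. All constants and the linear scorer are assumptions
# made for the example, not details taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

N_PROPERTIES = 3                           # arms: e.g., length, sentiment, recency
CONTEXT_DIM = 8                            # assumed user/item context size
REWARD_LEVELS = np.array([0.0, 0.5, 1.0])  # assumed graded (multinomial) reward values

# One linear reward model per arm (stand-in for a neural reward model).
weights = np.zeros((N_PROPERTIES, CONTEXT_DIM))
counts = np.ones(N_PROPERTIES)

def select_property(context, epsilon=0.1):
    """Epsilon-greedy choice of which review property to rely on."""
    if rng.random() < epsilon:
        return int(rng.integers(N_PROPERTIES))
    return int(np.argmax(weights @ context))

def observe_multinomial_reward(arm, context):
    """Simulated environment returning graded feedback for the chosen arm."""
    true_w = np.sin(np.arange(CONTEXT_DIM) + arm)          # hidden per-arm affinity
    p_good = 1.0 / (1.0 + np.exp(-true_w @ context))
    probs = np.array([(1 - p_good) ** 2,                    # low feedback
                      2 * p_good * (1 - p_good),            # medium feedback
                      p_good ** 2])                         # high feedback
    level = rng.choice(len(REWARD_LEVELS), p=probs)
    return REWARD_LEVELS[level]

def update(arm, context, reward, lr=0.05):
    """Online update of the chosen arm's model towards the observed reward."""
    pred = weights[arm] @ context
    weights[arm] += lr * (reward - pred) * context
    counts[arm] += 1

# Interaction loop: pick a property, observe graded feedback, update that arm.
for step in range(2000):
    ctx = rng.normal(size=CONTEXT_DIM)
    arm = select_property(ctx)
    r = observe_multinomial_reward(arm, ctx)
    update(arm, ctx, r)

print("Times each review property was selected:", counts.astype(int))
```

Replacing REWARD_LEVELS with {0, 1} recovers the binary-reward variant that the abstract compares against; the graded levels let the policy distinguish partially useful from highly useful feedback when choosing which property-specific features to trust.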
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications