TriMLP: A Foundational MLP-like Architecture for Sequential Recommendation

Author:

Jiang Yiheng (1), Xu Yuanbo (1), Yang Yongjian (1), Yang Funing (1), Wang Pengyang (2), Li Chaozhuo (3), Zhuang Fuzhen (4), Xiong Hui (5)

Affiliation:

1. Lab of Mobile Intelligent Computing (MIC), College of Computer Science and Technology, Jilin University, China

2. Department of Computer and Information Science, The State Key Laboratory of Internet of Things for Smart City, University of Macau, China

3. Microsoft Research Asia, China

4. Institute of Artificial Intelligence, Beihang University, China

5. Thrust of Artificial Intelligence, Hong Kong University of Science and Technology (Guangzhou), China

Abstract

In this work, we present TriMLP, a foundational MLP-like architecture for sequential recommendation that simultaneously achieves computational efficiency and promising performance. First, we empirically study the incompatibility between existing purely MLP-based models and sequential recommendation: the inherent fully-connected structure grants historical user-item interactions (referred to as tokens) unrestricted communication and thereby overlooks the essential chronological order of the sequence. We then propose the MLP-based Triangular Mixer, which establishes ordered contact among tokens and develops the primary sequential modeling capability under the standard auto-regressive training fashion. It contains (i) a global mixing layer, which drops the lower-triangle neurons in the MLP to block anti-chronological connections from future tokens, and (ii) a local mixing layer, which further disables specific upper-triangle neurons to split the sequence into multiple independent sessions. The mixer serially alternates these two layers to support fine-grained preference modeling: the global layer captures long-range dependencies across the whole sequence, while the local layer attends to short-term patterns within sessions. Experimental results on 12 datasets of different scales from 4 benchmarks show that TriMLP consistently attains a favorable accuracy/efficiency trade-off on all validated datasets, with an average performance boost of up to 14.88% over several state-of-the-art baselines and a maximum inference-time reduction of 23.73%. These properties render TriMLP a strong contender to the well-established RNN-, CNN- and Transformer-based sequential recommenders. Code is available at https://github.com/jiangyiheng1/TriMLP .
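The masking idea described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `triangular_mix`, the `session_len` parameter, and the matrix convention (output = masked weight matrix times tokens, so causality corresponds to a lower-triangular mask here; the paper's "upper/lower triangle" wording follows its own matrix layout) are assumptions for this sketch.

```python
import numpy as np

def triangular_mix(tokens, weight, session_len=None):
    """Sketch of triangular token mixing.

    tokens:  (seq_len, dim) array of token embeddings.
    weight:  (seq_len, seq_len) learnable token-mixing matrix of an MLP layer.
    session_len: if given, additionally restrict mixing to independent
                 local sessions (block-diagonal causal mask), mimicking
                 the local mixing layer; None mimics the global layer.
    """
    seq_len = tokens.shape[0]
    # Causal (chronological) mask: output position i may only draw on
    # input positions j <= i, blocking connections from future tokens.
    mask = np.tril(np.ones((seq_len, seq_len)))
    if session_len is not None:
        # Session mask: zero out cross-session entries so each session
        # of `session_len` tokens mixes only within itself.
        block = np.zeros_like(mask)
        for s in range(0, seq_len, session_len):
            block[s:s + session_len, s:s + session_len] = 1.0
        mask *= block
    # Each output token is a masked linear combination of input tokens
    # along the sequence axis.
    return (mask * weight) @ tokens
```

In a real model the masks would be applied to trainable weights inside alternating global/local layers; here a dense all-ones `weight` suffices to check that the masks enforce chronological order and session independence.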

Publisher

Association for Computing Machinery (ACM)

