It’s Not Always about Wide and Deep Models: Click-Through Rate Prediction with a Customer Behavior-Embedding Representation

Published: 2024-01-12 Issue: 1 Volume: 19 Page: 135-151
ISSN: 0718-1876
Container-title: Journal of Theoretical and Applied Electronic Commerce Research
Language: en
Short-container-title: JTAER

Author:

Alves Gomes Miguel¹, Meyes Richard¹, Meisen Philipp², Meisen Tobias¹

Affiliation:

1. Institute for Technologies and Management of Digital Transformation, University of Wuppertal, 42119 Wuppertal, Germany

2. Breinify Inc., San Francisco, CA 94105, USA

Abstract

Alongside natural language processing and computer vision, large learning models have found their way into e-commerce. Especially for recommender systems and click-through rate prediction, these models have shown great predictive power. In this work, we aim to predict the probability that a customer will click on a given recommendation, given only their current session. To this end, we propose a two-stage approach consisting of a customer behavior-embedding representation and a recurrent neural network. In the first stage, we train a self-supervised skip-gram embedding on customer activity data. In the second stage, the resulting embedding representation is used to encode the customer sequences, which then serve as input to the learning model. Our proposed approach diverges from the prevailing trend of using extensive end-to-end models for click-through rate prediction. The experiments, which cover a real-world industrial use case and a widely used, openly available benchmark dataset, demonstrate that our approach outperforms the current state-of-the-art models. Our approach predicts customers’ click intention with an average F1 score of 94% for the industrial use case, which is one percentage point higher than the state-of-the-art baseline, and an average F1 score of 79% for the benchmark dataset, which outperforms the best tested state-of-the-art baseline by more than seven percentage points. The results show that, contrary to current trends in the field, large end-to-end models are not always needed. Our analysis of the experiments suggests that the performance of our approach stems from the self-supervised pre-trained embedding of customer behavior that we use as the customer representation.
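
The first stage described in the abstract trains a skip-gram embedding on customer activity data. As a rough illustration only, the sketch below shows how (center, context) training pairs could be derived from a single session of events. The function name `skipgram_pairs`, the `window` parameter, and the event tokens are assumptions made for this example; the abstract does not specify how the authors tokenize events or which window size they use.

```python
from typing import List, Tuple


def skipgram_pairs(session: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for one session of customer events.

    Each event is paired with every other event that lies at most
    `window` positions before or after it in the session.
    """
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(session):
        start = max(0, i - window)
        stop = min(len(session), i + window + 1)
        for j in range(start, stop):
            if j != i:  # an event is never its own context
                pairs.append((center, session[j]))
    return pairs


# Hypothetical session; the tokens are illustrative, not taken from the paper's data.
session = ["view:shoes", "view:socks", "cart:shoes"]
pairs = skipgram_pairs(session, window=1)
```

A skip-gram model would then be trained to predict the context event from the center event, and the learned vectors could encode each session as a sequence for the recurrent network in the second stage.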

Publisher

MDPI AG

Subject

Computer Science Applications; General Business, Management and Accounting

Link

https://www.mdpi.com/0718-1876/19/1/8/pdf

Reference: 70 articles.

1. Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Advances in Neural Information Processing Systems, Curran Associates, Inc.

2. Larochelle (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems.

3. Ramesh, A., Pavlov, M., Goh, G., Gray, S., Voss, C., Radford, A., Chen, M., and Sutskever, I. (2021, January 18–24). Zero-Shot Text-to-Image Generation. Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual.

4. Cheng, H.T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., Anderson, G., Corrado, G., Chai, W., and Ispir, M. (2016, January 15). Wide & Deep Learning for Recommender Systems. Proceedings of the DLRS 2016 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA.

5. Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W., and Jiang, P. (2019, January 3–7). BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. Proceedings of the CIKM ’19 28th ACM International Conference on Information and Knowledge Management, Beijing, China.