A Reinforcement Learning Approach to Optimize Discount and Reputation Tradeoffs in E-commerce Systems

Published:2020-11-08 Issue:4 Volume:20 Page:1-26
ISSN:1533-5399
Container-title:ACM Transactions on Internet Technology
language:en
Short-container-title:ACM Trans. Internet Technol.

Author:

Xie Hong¹, Li Yongkun², Lui John C. S.³

Affiliation:

1. Chongqing University, Shazhengjie, Shapingba, Chongqing, China

2. University of Science and Technology of China, Hefei, Anhui, China

3. The Chinese University of Hong Kong, Hong Kong SAR

Abstract

Feedback-based reputation systems are widely deployed in E-commerce systems. Evidence shows that earning a reputable label (for sellers of such systems) may take a substantial amount of time, and this implies a reduction of profit. We propose to enhance sellers’ reputation via price discounts. However, the challenges are as follows: (1) the demands from buyers depend on both the discount and the reputation, and (2) the demands are unknown to the seller. To address these challenges, we first formulate a profit maximization problem via a semi-Markov decision process to explore the optimal tradeoffs in selecting price discounts. We prove the monotonicity of the optimal profit and optimal discount. Based on the monotonicity, we design a Q-learning with forward projection (QLFP) algorithm, which infers the optimal discount from historical transaction data. We prove that the QLFP algorithm converges to the optimal policy. We conduct trace-driven simulations using a dataset from eBay to evaluate the QLFP algorithm. Evaluation results show that QLFP improves the profit by as much as 50% over both Q-learning and Speedy Q-learning. The QLFP algorithm also improves both the reputation and the profit by as much as two times over the scheme of not providing any price discount.
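As a rough illustration of the kind of baseline the abstract compares against, the sketch below runs plain tabular Q-learning on a toy discount-selection problem. It is not the QLFP algorithm, and it does not use the paper's semi-Markov formulation or its forward-projection step, neither of which is specified in the abstract. The discount levels, reputation scale, price, cost, and demand model are all hypothetical values chosen only to make the example run.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
DISCOUNTS = [0.0, 0.1, 0.2, 0.3]  # candidate discounts (fraction off the price)
MAX_REP = 5                       # reputation levels 0..MAX_REP
PRICE, COST = 10.0, 6.0           # list price and unit cost


def step(rep, discount, rng):
    """Toy demand model: a sale is more likely with higher reputation or discount."""
    p_sale = min(1.0, 0.2 + 0.1 * rep + 1.0 * discount)
    sold = rng.random() < p_sale
    reward = (PRICE * (1.0 - discount) - COST) if sold else 0.0
    # Assumption: each sale earns positive feedback and raises reputation by one level.
    next_rep = min(MAX_REP, rep + 1) if sold else rep
    return reward, next_rep


def q_learning(episodes=2000, horizon=50, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Plain epsilon-greedy tabular Q-learning over (reputation, discount) pairs."""
    rng = random.Random(seed)
    q = [[0.0] * len(DISCOUNTS) for _ in range(MAX_REP + 1)]
    for _ in range(episodes):
        rep = 0  # every episode starts with a new, unreputed seller
        for _ in range(horizon):
            if rng.random() < eps:
                a = rng.randrange(len(DISCOUNTS))  # explore
            else:
                a = max(range(len(DISCOUNTS)), key=lambda i: q[rep][i])  # exploit
            reward, nxt = step(rep, DISCOUNTS[a], rng)
            # Standard Q-learning temporal-difference update.
            q[rep][a] += alpha * (reward + gamma * max(q[nxt]) - q[rep][a])
            rep = nxt
    return q


def greedy_policy(q):
    """Discount with the highest learned Q-value at each reputation level."""
    return [DISCOUNTS[max(range(len(DISCOUNTS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    for rep, discount in enumerate(greedy_policy(q_learning())):
        print(f"reputation {rep}: discount {discount:.0%}")
```

In this toy setting the seller trades a lower margin now for faster reputation growth, which is the tradeoff the abstract describes; the learned policy depends entirely on the assumed demand model.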

Funder

GRF

Chongqing High-Technology Innovation and Application Development Funds

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications

Link

https://dl.acm.org/doi/pdf/10.1145/3400024

References: 35 articles.

1. Mohammad Gheshlaghi Azar, Remi Munos, Mohammad Ghavamzadeh, and Hilbert Kappen. 2011. Speedy Q-learning. In Advances in Neural Information Processing Systems.

2. Evidence of the Effect of Trust Building Technology in Electronic Markets: Price Premiums and Buyer Behavior

Cited by 3 articles.

1. Consumer evaluation mechanisms on e-commerce platforms: reputation management and analysis of influencing factors;Applied Mathematics and Nonlinear Sciences;2024-01-01

2. E-Commerce: Reach Customers and Drive Sales with Data Science and Big Data Analytics;2023 2nd International Conference for Innovation in Technology (INOCON);2023-03-03

3. Chinese Emotional Dialogue Response Generation via Reinforcement Learning;ACM Transactions on Internet Technology;2021-07-22