Efficient Bike-sharing Repositioning with Cooperative Multi-Agent Deep Reinforcement Learning

Author:

Jing Yao1, Guo Bin1, Liu Yan2, Zhang Daqing3, Zeghlache Djamal3, Yu Zhiwen4

Affiliation:

1. Northwestern Polytechnical University, China

2. Peking University, China

3. Télécom SudParis, France

4. Harbin Engineering University; Northwestern Polytechnical University, China

Abstract

As an emerging mobility-on-demand service, the bike-sharing system (BSS) has spread all over the world, providing citizens with a flexible, cost-efficient, and environmentally friendly transportation mode. Demand-supply imbalance is one of the main challenges in BSSs, because existing bike repositioning strategies reallocate bikes according to a pre-defined periodic schedule without considering highly dynamic user demand. While reinforcement learning has been applied to repositioning problems to mitigate demand-supply imbalance, extending it to BSSs faces a significant barrier: the curse of dimensionality in the action space, caused by the dynamic number of workers and bikes in a city. In this paper, we study this barrier and address it by proposing a novel bike repositioning system, BikeBrain, which consists of a demand prediction model and a spatio-temporal bike repositioning algorithm. Specifically, to obtain accurate and real-time usage demand for efficient bike repositioning, we first present a prediction model, ST-NetPre, which directly predicts user demand while capturing highly dynamic spatio-temporal characteristics. Furthermore, we propose a spatio-temporal cooperative multi-agent reinforcement learning method (ST-CBR) for learning a worker-based bike repositioning strategy, in which each worker in the BSS is treated as an agent. In particular, ST-CBR adopts centralized learning with decentralized execution to achieve effective cooperation among large-scale, dynamically changing agents based on Mean Field Reinforcement Learning (MFRL), thereby avoiding the huge dimension of the joint action space. To handle the dynamic action space, ST-CBR utilizes a SoftMax selector to choose the specific action. Meanwhile, accounting for both the benefits and the costs of agents' operations, an efficient reward function is designed to seek an optimal control policy that considers both immediate and future rewards.
Extensive experiments conducted on large-scale real-world datasets show that our proposed method significantly outperforms several state-of-the-art baselines on demand-supply gap and operation cost measures.
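The two mechanisms the abstract names can be illustrated concretely. In MFRL, each agent's Q-function conditions on its own action and on the mean of its neighbors' actions (the "mean field"), so complexity no longer grows with the joint action space; the SoftMax (Boltzmann) selector then samples an action from the Q-values. The sketch below is illustrative only: the action count, function names, and temperature parameter are assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 5  # assumed: e.g. repositioning choices toward neighboring regions


def mean_action(neighbor_actions, n_actions=N_ACTIONS):
    """Mean of neighbors' one-hot actions -- the 'mean field' an MFRL
    agent conditions on instead of the full joint action."""
    one_hot = np.eye(n_actions)[np.asarray(neighbor_actions)]
    return one_hot.mean(axis=0)


def softmax_select(q_values, temperature=1.0):
    """SoftMax (Boltzmann) selector over a discrete, possibly dynamic
    action set: sample an action with probability proportional to
    exp(Q / temperature)."""
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    action = rng.choice(len(probs), p=probs)
    return action, probs
```

Because `softmax_select` works on whatever Q-vector it is given, the same selector applies when the set of feasible actions changes from step to step, which is the property the abstract attributes to the SoftMax selector.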

Publisher

Association for Computing Machinery (ACM)
