Dynamic path learning in decision trees using contextual bandits

Published:2022-04-20 Issue:1 Volume:26 Page:271-296
ISSN:1386-145X
Container-title:World Wide Web
language:en
Short-container-title:World Wide Web

Author:

Ju Weiyu, Yuan Dong, Bao Wei, Ge Liming, Zhou Bing Bing

Abstract

We present a novel online decision-making solution in which the optimal path of a given decision tree is found dynamically using contextual bandit analysis. In each round, the learner finds a path in the decision tree by making a sequence of decisions that follow the tree structure, and it receives an outcome when a terminal node is reached. At each decision node, environment information is observed that hints at which child node to visit to obtain a better outcome. The objective is to learn the context-specific optimal decision for each decision node so as to maximize the accumulated outcome. In this paper, we propose Dynamic Path Identifier (DPI), a learning algorithm in which a contextual bandit is applied to every decision node and the observed outcome is used as the reward for all earlier decisions of the same round. The technical difficulty of DPI is the high exploration challenge caused by the width (i.e., the number of paths) of the tree as well as the large context space. We mathematically prove that DPI's regret per round approaches zero as the number of rounds approaches infinity. We also prove that the regret is not a function of the number of paths in the tree. Numerical evaluations are provided to complement the theoretical analysis.
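The scheme described in the abstract (one contextual bandit per decision node, with the terminal outcome credited to every decision made in that round) can be illustrated with a minimal sketch. This is not the authors' DPI algorithm: the abstract does not specify the bandit policy, so a simple epsilon-greedy learner over discrete contexts is substituted here, and the tree, the contexts, the reward rule, and all names (`NodeBandit`, `run_round`, `train`) are hypothetical.

```python
import random
from collections import defaultdict


class NodeBandit:
    """Epsilon-greedy contextual bandit for one decision node.

    Assumption: contexts are discrete and hashable. This stands in for
    whatever bandit policy the paper actually uses.
    """

    def __init__(self, children, epsilon=0.2):
        self.children = children
        self.epsilon = epsilon
        self.counts = defaultdict(int)   # (context, child) -> number of updates
        self.means = defaultdict(float)  # (context, child) -> mean reward

    def greedy(self, context):
        # Ties are broken in favour of the first child listed.
        return max(self.children, key=lambda c: self.means[(context, c)])

    def choose(self, context, rng):
        if rng.random() < self.epsilon:
            return rng.choice(self.children)  # explore
        return self.greedy(context)           # exploit

    def update(self, context, child, reward):
        key = (context, child)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def run_round(tree, bandits, outcome, rng, root="root"):
    """Walk from the root to a terminal node, then credit the outcome
    to every decision made during this round."""
    node, trace = root, []
    while node in tree:                 # keys of `tree` are decision nodes
        context = rng.randint(0, 1)     # toy binary context seen at this node
        child = bandits[node].choose(context, rng)
        trace.append((node, context, child))
        node = child
    reward = outcome(node, trace)
    for decision_node, context, child in trace:
        bandits[decision_node].update(context, child, reward)
    return node, reward


# Hypothetical two-level tree: the best leaf depends on the root's context.
TREE = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}


def toy_outcome(leaf, trace):
    root_context = trace[0][1]
    best_leaf = "a1" if root_context == 0 else "b2"
    return 1.0 if leaf == best_leaf else 0.0


def train(rounds=20000, seed=0):
    rng = random.Random(seed)
    bandits = {node: NodeBandit(children) for node, children in TREE.items()}
    for _ in range(rounds):
        run_round(TREE, bandits, toy_outcome, rng)
    return bandits
```

After `train()`, calling `bandits["root"].greedy(0)` and `bandits["root"].greedy(1)` shows which subtree each root context has learned to prefer. Note that the number of learned statistics grows with the number of decision nodes and contexts, not with the number of root-to-leaf paths, which is the intuition behind applying a bandit per node rather than per path.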

Funder

Australian Research Council

University of Sydney

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications,Hardware and Architecture,Software

Link

https://link.springer.com/content/pdf/10.1007/s11280-022-01043-0.pdf

References (42 articles)

1. Magee, J.F.: Decision trees for decision making. Harvard Business Review, Boston (1964)

2. Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A.: Classification and regression trees. CRC Press, United States (1984)

3. Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man, and Cybernetics 21(3), 660–674 (1991)

4. Zhang, S.: Multiple-scale cost sensitive decision tree learning. World Wide Web 21(6), 1787–1800 (2018)

5. Huntley, N., Troffaes, M.: Normal form backward induction for decision trees with coherent lower previsions. Annals of Operations Research, 195 (2011). https://doi.org/10.1007/s10479-011-0968-2