Abstract
AbstractThe key contribution of this paper is a theoretical framework to analyse humans’ decision-making strategies under uncertainty, and more specifically how human subjects manage the trade-off between information gathering (exploration) and reward seeking (exploitation) in particular active learning in a black-box optimization task. Humans’ decisions making according to these two objectives can be modelled in terms of Pareto rationality. If a decision set contains a Pareto efficient (dominant) strategy, a rational decision maker should always select the dominant strategy over its dominated alternatives. A distance from the Pareto frontier determines whether a choice is (Pareto) rational. The key element in the proposed analytical framework is the representation of behavioural patterns of human learners as a discrete probability distribution, specifically a histogram considered as a non-parametric estimate of discrete probability density function on the real line. Thus, the similarity between users can be captured by a distance between their associated histograms. This maps the problem of the characterization of humans’ behaviour into a space, whose elements are probability distributions, structured by a distance between histograms, namely the optimal transport-based Wasserstein distance. The distributional analysis gives new insights into human behaviour in search tasks and their deviations from Pareto rationality. Since the uncertainty is one of the two objectives defining the Pareto frontier, the analysis has been performed for three different uncertainty quantification measures to identify which better explains the Pareto compliant behavioural patterns. Beside the analysis of individual patterns Wasserstein has also enabled a global analysis computing the WST barycenters and performing k-means Wasserstein clustering.
Funder
Università degli Studi di Milano - Bicocca
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Artificial Intelligence
Reference56 articles.
1. Wilson, R.C., Bonawitz, E., Costa, V.D., Ebitz, R.B.: Balancing exploration and exploitation with information and randomization. Curr. Opin. Behav. Sci. 38, 49–56 (2020)
2. Wilson, R.C., Geana, A., White, J.M., Ludvig, E.A., Cohen, J.D.: Humans use directed and random exploration to solve the explore–exploit dilemma. J. Exp. Psychol. Gen. 143(6), 2074 (2014)
3. Gershman, S.J.: Deconstructing the human algorithms for exploration. Cognition. 173, 34–42 (2018)
4. Schulz, E., Gershman, S.J.: The algorithmic architecture of exploration in the human brain. Curr. Opin. Neurobiol. 55, 7–14 (2019)
5. Schulz, E., Tenenbaum, J.B., Reshef, D.N., Speekenbrink, M., Gershman, S.: Assessing the Perceived Predictability of Functions. In: CogSci, vol. 6 (2015, November)
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献