Effects of machine learning errors on human decision-making: manipulations of model accuracy, error types, and error importance

Published:2024-08-26 Issue:1 Volume:9
ISSN:2365-7464
Container-title:Cognitive Research: Principles and Implications
language:en
Short-container-title:Cogn. Research

Author:

Matzen Laura E., Gastelum Zoe N., Howell Breannan C., Divis Kristin M., Stites Mallory C.

Abstract

This study addressed the cognitive impacts of providing correct and incorrect machine learning (ML) outputs in support of an object detection task. The study consisted of five experiments that manipulated the accuracy and importance of mock ML outputs. In each of the experiments, participants were given the T and L task with T-shaped targets and L-shaped distractors. They were tasked with categorizing each image as target present or target absent. In Experiment 1, they performed this task without the aid of ML outputs. In Experiments 2–5, they were shown images with bounding boxes, representing the output of an ML model. The outputs could be correct (hits and correct rejections), or they could be erroneous (false alarms and misses). Experiment 2 manipulated the overall accuracy of these mock ML outputs. Experiment 3 manipulated the proportion of different types of errors. Experiments 4 and 5 manipulated the importance of specific types of stimuli or model errors, as well as the framing of the task in terms of human or model performance. These experiments showed that model misses were consistently harder for participants to detect than model false alarms. In general, as the model’s performance increased, human performance increased as well, but in many cases the participants were more likely to overlook model errors when the model had high accuracy overall. Warning participants to be on the lookout for specific types of model errors had very little impact on their performance. Overall, our results emphasize the importance of considering human cognition when determining what level of model performance and types of model errors are acceptable for a given task.

Funder

Sandia National Laboratories

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1186/s41235-024-00586-2.pdf

References (47 articles)

1. Bandlow, A., Jones, K. A., Brown, N. J. K., & Nozick, L. K. (2017). The impact of false and nuisance alarms on the design optimization of physical security systems. In I. Nunes (Ed.), Advances in human factors and system interactions. Advances in Intelligent Systems and Computing (Vol. 497, pp. 189–201). Springer, Cham. https://doi.org/10.1007/978-3-319-41956-5_18

2. Biggs, A. T., Kramer, M. R., & Mitroff, S. R. (2018). Using cognitive psychology research to inform professional visual search operations. Journal of Applied Research in Memory and Cognition, 7(2), 189–198.

3. Bruzzese, T., Gao, I., Dietz, G., Ding, C., & Romanos, A. (2020). Effect of confidence indicators on trust in AI-generated profiles. In Extended abstracts of the 2020 CHI conference on human factors in computing systems (pp. 1–8).

4. Cireşan, D., Meier, U., Masci, J., & Schmidhuber, J. (2011). A committee of neural networks for traffic sign classification. In The 2011 International joint conference on neural networks (pp. 1918–1921). IEEE.

5. Coppers, S., Van den Bergh, J., Luyten, K., Coninx, K., Van der Lek-Ciudin, I., Vanallemeersch, T., & Vandeghinste, V. (2018). Intellingo: An intelligible translation environment. In Proceedings of the 2018 CHI conference on human factors in computing systems (pp. 1–13).