Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making

Author:

Inkpen, Kori (1); Chappidi, Shreya (2); Mallari, Keri (3); Nushi, Besmira (1); Ramesh, Divya (4); Michelucci, Pietro (5); Mandava, Vani (1); Vepřek, Libuše Hannah (6); Quinn, Gabrielle (7)

Affiliation:

1. Microsoft Research

2. University of Virginia

3. University of Washington

4. University of Michigan

5. Human Computation Institute

6. LMU Munich

7. Western Washington University

Abstract

Human-AI collaboration for decision-making strives to achieve team performance that exceeds the performance of humans or AI alone. However, many factors can impact the success of Human-AI teams, including a user’s domain expertise, mental models of an AI system, trust in recommendations, and more. This article reports on a study that examines users’ interactions with three simulated algorithmic models, all with equivalent accuracy rates but each tuned differently in terms of true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled. Users completed 140 trials across multiple stages, first without an AI and then with recommendations from an AI-Assistant. Although all users had prior experience with the task, their levels of proficiency varied widely.

Our results demonstrated that while recommendations from an AI-Assistant can aid in users’ decision making, several underlying factors, including user base expertise and complementary human-AI tuning, significantly impact the overall team performance. First, users’ base performance matters, particularly in comparison to the performance level of the AI. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Mid-performers, who had a similar level of accuracy to the AI, were most variable in terms of whether the AI recommendations helped or hurt their performance. Second, tuning an AI algorithm to complement users’ strengths and weaknesses also significantly impacted users’ performance. For example, users in our study were better at detecting flowing blood vessels, so when the AI was tuned to reduce false negatives (at the expense of increasing false positives), users were able to reject the resulting false positives more easily and improve in accuracy. Finally, users’ perception of the AI’s performance relative to their own performance had an impact on whether users’ accuracy improved when given recommendations from the AI. Overall, this work reveals important insights on the complex interplay of factors influencing Human-AI collaboration and provides recommendations on how to design and tune AI algorithms to complement users in decision-making tasks.
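The idea of models with equivalent accuracy but different tuning can be illustrated with a small sketch. The rates, the tuning names, and the 50/50 class balance below are hypothetical values chosen for illustration and are not taken from the study; which class counts as "positive" is also left abstract here. The sketch only shows that accuracy is a weighted average of the true positive rate and the true negative rate, so different trade-offs can yield the same overall accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tuning:
    """A simulated model described only by its error profile (hypothetical values)."""
    name: str
    tpr: float  # true positive rate: share of positive cases labeled positive
    tnr: float  # true negative rate: share of negative cases labeled negative


def accuracy(tuning: Tuning, positive_share: float) -> float:
    """Overall accuracy as the class-weighted average of TPR and TNR."""
    return tuning.tpr * positive_share + tuning.tnr * (1.0 - positive_share)


def confusion_counts(tuning: Tuning, n_pos: int, n_neg: int) -> dict:
    """Expected confusion-matrix counts for a given number of cases per class."""
    tp = round(tuning.tpr * n_pos)
    tn = round(tuning.tnr * n_neg)
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}


# Assumption: three illustrative tunings, not the ones used in the article.
TUNINGS = [
    Tuning("balanced", tpr=0.80, tnr=0.80),
    Tuning("high_tpr", tpr=0.90, tnr=0.70),  # fewer false negatives, more false positives
    Tuning("high_tnr", tpr=0.70, tnr=0.90),  # fewer false positives, more false negatives
]

if __name__ == "__main__":
    # Assumption: balanced classes (50 positive, 50 negative cases).
    for t in TUNINGS:
        print(t.name, round(accuracy(t, 0.5), 2), confusion_counts(t, 50, 50))
```

Under these assumed values, all three tunings reach the same accuracy on a balanced set, but they make different kinds of errors. This is the property that lets a tuning be matched to the type of error users are better at catching.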

Publisher

Association for Computing Machinery (ACM)

Subject

Human-Computer Interaction

References: 62 articles.

Cited by 12 articles.