Brain versus bot: Distinguishing letters of recommendation authored by humans compared with artificial intelligence

Authors:

Carl Preiksaitis1, Christopher Nash2, Michael Gottlieb3, Teresa M. Chan4, Al'ai Alvarez1, Adaira Landry5

Affiliation:

1. Department of Emergency Medicine, Stanford School of Medicine, Stanford, California, USA

2. Department of Emergency Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA

3. Department of Emergency Medicine, Rush University Medical Center, Chicago, Illinois, USA

4. Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Ontario, Canada

5. Department of Emergency Medicine, Harvard Medical School, Boston, Massachusetts, USA

Abstract

Objectives: Letters of recommendation (LORs) are essential within academic medicine, affecting a number of important decisions regarding advancement, yet these letters take significant time and labor to prepare. Generative artificial intelligence (AI) tools, such as ChatGPT, are gaining popularity for a variety of academic writing tasks and offer an innovative way to relieve the burden of letter writing. Whether ChatGPT could aid in crafting LORs, particularly in high-stakes contexts such as faculty promotion, has yet to be determined. To assess the feasibility of this process and whether there is a significant difference between AI- and human-authored letters, we conducted a study aimed at determining whether academic physicians can distinguish between the two.

Methods: A quasi-experimental study was conducted using a single-blind design. Academic physicians with experience reviewing LORs were presented with LORs for promotion to associate professor written by either humans or AI. Participants reviewed the letters and identified the authorship. Statistical analysis was performed to determine accuracy in distinguishing between human- and AI-authored LORs. Additionally, the perceived quality and persuasiveness of the LORs were compared based on suspected and actual authorship.

Results: A total of 32 participants completed the letter review. Mean accuracy in distinguishing human- from AI-authored LORs was 59.4%. Reviewers' certainty and time spent deliberating did not significantly affect accuracy. LORs suspected to be human-authored were rated more favorably in terms of quality and persuasiveness. A difference in gender-biased language was observed in our letters: human-authored letters contained significantly more female-associated words, while the majority of AI-authored letters used more male-associated words.

Conclusions: Participants were unable to reliably differentiate between human- and AI-authored LORs for promotion. AI may be able to generate LORs and relieve the burden of letter writing for academicians. New strategies, policies, and guidelines are needed to balance the benefits of AI while preserving integrity and fairness in academic promotion decisions.

Publisher

Wiley

Subject

Emergency Nursing, Education, Emergency Medicine

Cited by 6 articles.
