Affiliation:
1. Department of Otolaryngology‐Head & Neck Surgery, Stanford University School of Medicine, Stanford, California, U.S.A.
2. Department of Otolaryngology‐Head & Neck Surgery, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, U.S.A.
3. Division of Pediatric Otolaryngology, Department of Otolaryngology‐Head & Neck Surgery, Stanford University School of Medicine, Palo Alto, California, U.S.A.
4. Division of Rhinology and Skull Base Surgery, Department of Otolaryngology‐Head & Neck Surgery, Mass Eye and Ear, Boston, Massachusetts, U.S.A.
Abstract
Objectives: To investigate potential demographic bias in artificial intelligence (AI)‐based simulations of otolaryngology residency selection committee (RSC) members tasked with selecting one applicant among candidates of varied race, gender, and sexual orientation.
Methods: This study employed random sampling of simulated RSC member decisions using a novel Application Programming Interface (API) to virtually connect to OpenAI's Generative Pre‐Trained Transformers (GPT‐4 and GPT‐4o). Simulated RSC members with diverse demographics were tasked with ranking to match 1 applicant among 10 of varied race, gender, and sexual orientation. All applicants had identical qualifications; only the demographics of the applicants and RSC members were varied for each simulation. Each RSC simulation ran 1000 times. Chi‐square tests analyzed differences across categorical variables. GPT‐4o simulations additionally requested a rationale for each decision.
Results: Simulated RSCs consistently showed racial, gender, and sexual orientation bias. Most applicant pairwise comparisons showed statistical significance (p < 0.05). White and Black RSCs exhibited the greatest preference for applicants sharing their own demographic characteristics, favoring White and Black female applicants, respectively, over others (all pairwise p < 0.001). Asian male applicants consistently received the lowest selection rates. Male RSCs favored White male and female applicants, while female RSCs preferred LGBTQIA+, White, and Black female applicants (all p < 0.05). High socioeconomic status (SES) RSCs favored White female and LGBTQIA+ applicants, while low SES RSCs favored Black female and LGBTQIA+ applicants over others (all p < 0.001). Results from the newest iteration of the LLM, ChatGPT‐4o, indicated evolved selection preferences favoring Black female and LGBTQIA+ applicants across all RSCs, with the rationale of prioritizing inclusivity given in >95% of such decisions.
Conclusion: Utilizing publicly available large language models (LLMs) to aid in otolaryngology residency selection may introduce significant racial, gender, and sexual orientation bias. The potential for significant and evolving LLM bias should be appreciated and minimized to promote a diverse and representative field of future otolaryngologists in alignment with current workforce data.
Level of Evidence: N/A
Laryngoscope, 2024
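As an illustration of the pairwise chi-square comparisons mentioned in the Methods, the following is a minimal sketch only. The abstract does not specify how the tests were set up, so the equal-probability null hypothesis, the function name `pairwise_chi_square`, and the example counts are all assumptions made for illustration and are not data from the study.

```python
import math


def pairwise_chi_square(count_a: int, count_b: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test (1 degree of freedom) for two counts.

    Assumed null hypothesis: both applicants are equally likely to be
    selected, so each expected count is half of the combined total.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one selection is required")
    expected = total / 2
    statistic = (
        (count_a - expected) ** 2 + (count_b - expected) ** 2
    ) / expected
    # For 1 degree of freedom, the upper-tail probability of a
    # chi-square variable x equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical selection counts for two applicants (not study data).
statistic, p_value = pairwise_chi_square(140, 60)
print(f"chi-square = {statistic:.1f}, p = {p_value:.2e}")
```

With identical qualifications, an unbiased selector would be expected to choose each applicant of a pair about equally often, so a small p-value here would indicate a preference that the qualifications alone do not explain.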