Abstract
Aim: To examine the clinical accuracy and applicability of ChatGPT answers to commonly asked questions from patients considering posterior lumbar decompression (PLD).
Methods: A literature review was conducted to identify 10 questions encompassing some of the most common questions and concerns patients may have regarding lumbar decompression surgery. The selected questions were posed to ChatGPT, and the initial responses were recorded; no follow-up or clarifying questions were permitted. Two attending fellowship-trained spine surgeons then graded each response from the chatbot using a modified Global Quality Scale to evaluate ChatGPT’s accuracy and utility. The surgeons then analyzed each question, providing evidence-based justifications for the scores.
Results: The minimum possible total score across all ten questions was 20, and the maximum was 100. ChatGPT’s responses in this analysis earned a total score of 59, corresponding to an average of just under 3 per question, when evaluated by the two attending spine surgeons. A score of 3 denoted a somewhat useful response of moderate quality, with some important information adequately discussed and some poorly discussed.
Conclusion: ChatGPT can provide broadly useful responses to common preoperative questions from patients considering PLD. It has excellent utility in providing background information and in helping patients become more informed about their pathology in general. However, it often lacks the patient-specific context necessary to offer personalized, accurate insights into prognosis and treatment options.