Affiliation:
1. Rutgers New Jersey Medical School, Newark, New Jersey, USA
Abstract
Purpose: Since its release in November 2022, Chat Generative Pre‐Trained Transformer 3.5 (ChatGPT), a complex machine learning model, has garnered more than 100 million users worldwide. The aim of this study is to determine how well ChatGPT can generate novel systematic review ideas on topics within spine surgery.

Methods: ChatGPT was instructed to give ten novel systematic review ideas for five popular topics in spine surgery literature: microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement. A comprehensive literature search was conducted in PubMed, CINAHL, EMBASE and Cochrane. The number of nonsystematic review articles and number of systematic review papers that had been published on each ChatGPT‐generated idea were recorded.

Results: Overall, ChatGPT had a 68% accuracy rate in creating novel systematic review ideas. More specifically, the accuracy rates were 80%, 80%, 40%, 70% and 70% for microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement, respectively. However, there was a 32% rate of ChatGPT generating ideas for which there were 0 nonsystematic review articles published. There was a 71.4%, 50%, 22.2%, 50%, 62.5% and 51.2% success rate of generating novel systematic review ideas, for which there were also nonsystematic reviews published, for microdiscectomy, laminectomy, spinal fusion, kyphoplasty, disc replacement and overall, respectively.

Conclusions: ChatGPT generated novel systematic review ideas at an overall rate of 68%. ChatGPT can help identify knowledge gaps in spine research that warrant further investigation, when used under supervision of an experienced spine specialist. This technology can be erroneous and lacks intrinsic logic, so it should never be used in isolation.

Level of Evidence: Not applicable.
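
As a quick arithmetic check, the overall 68% figure is consistent with the per-topic accuracy rates, given ten ideas per topic. The sketch below is illustrative only: it assumes each reported rate is a count of novel ideas out of ten, and the variable names are hypothetical rather than taken from the study.

```python
# Per-topic accuracy rates as reported in the abstract (percent).
# Assumption: each rate is the share of the ten ideas per topic judged novel.
reported_rates = {
    "microdiscectomy": 80,
    "laminectomy": 80,
    "spinal fusion": 40,
    "kyphoplasty": 70,
    "disc replacement": 70,
}

IDEAS_PER_TOPIC = 10  # ten ideas were requested for each topic

# Convert each percentage back to a count of novel ideas.
novel_counts = {
    topic: rate * IDEAS_PER_TOPIC // 100
    for topic, rate in reported_rates.items()
}

total_novel = sum(novel_counts.values())
total_ideas = IDEAS_PER_TOPIC * len(reported_rates)
overall_rate = 100 * total_novel / total_ideas

print(f"{total_novel} of {total_ideas} ideas novel ({overall_rate:.0f}%)")
```

Under that assumption, the per-topic counts sum to 34 novel ideas out of 50, which matches the reported overall rate.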