Abstract
Evidence-based medicine relies on systematic reviews of randomized controlled trials and other clinical intervention studies to synthesize the latest and most valid estimates of treatment effects. However, conducting these systematic reviews is time-consuming and labor-intensive, requiring careful design of sensitive search queries and manual identification of relevant studies.
New database architectures and natural language processing (NLP) techniques have recently emerged that may streamline the systematic review process. These approaches allow fast, fuzzy searches using non-Boolean queries and automatic extraction of meta-information from unstructured database texts.
Our study compares the effectiveness of NLP-based literature searches within a new database structure with the yield of the study sets in the Cochrane Database of Systematic Reviews, currently the gold standard. We built a stand-alone, freely available, fuzzy-enabled elastic search database containing all 36 million PubMed-indexed entries. The database is synchronized daily with PubMed. We developed and validated reliable filters to identify randomized clinical trials and other clinical intervention studies and to extract Population- and Intervention-relevant subtext.
Relevant subtexts were detected with a precision of 0.74, a recall of 0.81, and an F1-score of 0.77 for the Population subtext, and a precision of 0.70, a recall of 0.71, and an F1-score of 0.70 for the Intervention subtext. We found that short, user-friendly, approximate queries were valuable for rapidly identifying the included studies within a random set of Cochrane intervention reviews. In 90% of systematic reviews (27/30), the new search strategy missed no more than two of the trials included by Cochrane, while keeping the total number of hits lower than that of a comparable PubMed keyword search (87%; 26/30).
These findings suggest that NLP-based literature searches within a new database structure on top of PubMed can be a promising approach for compiling and updating aggregated clinical evidence more efficiently and effectively.
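As a reading aid, the reported F1-scores can be checked against the stated precision and recall, since F1 is conventionally defined as their harmonic mean. The sketch below is not the study's evaluation code: the function name `f1_score` and the dictionary layout are illustrative assumptions, and the inputs are the rounded values from the abstract, so agreement is only expected to two decimal places.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as reported in the abstract (not raw study data).
reported = {
    "Population": (0.74, 0.81),
    "Intervention": (0.70, 0.71),
}

# Recompute F1 per subtext type and round to the two decimals used in the abstract.
f1_by_subtext = {
    name: round(f1_score(precision, recall), 2)
    for name, (precision, recall) in reported.items()
}

for name, f1 in f1_by_subtext.items():
    print(f"{name}: F1 = {f1:.2f}")
```

Under these assumptions, the recomputed values match the reported F1-scores of 0.77 (Population) and 0.70 (Intervention).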
Publisher
Cold Spring Harbor Laboratory