BACKGROUND
Under the paradigm of Precision Medicine (PM), patients with the same disease can receive different personalized therapies according to their clinical and genetic features. These therapies are determined by the totality of available clinical evidence, including results from case reports, clinical trials and systematic reviews. However, it is increasingly difficult for physicians to find such evidence in the scientific literature, whose volume is growing at an unprecedented pace.
OBJECTIVE
In this work, we propose the PM-Search system to facilitate the retrieval of clinical literature that contains critical evidence for or against giving specific therapies to certain cancer patients.
METHODS
The PM-Search system combines a Baseline Retriever, which selects candidate documents at large scale, with an Evidence Re-ranker, which reorders those candidates by the quality of the evidence they contain. The Baseline Retriever uses query expansion and keyword matching with the Elasticsearch retrieval engine; the Evidence Re-ranker fits pre-trained language models to expert annotations collected through an active learning strategy.
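The two-stage design described above can be illustrated with a minimal sketch. It is not the PM-Search implementation: the synonym table, the term-count scoring (a stand-in for Elasticsearch keyword matching), and the cue-word `toy_evidence_scorer` (a stand-in for a fitted pre-trained language model) are all hypothetical and chosen only to keep the example self-contained.

```python
from typing import Callable, Dict, List

# Hypothetical synonym table standing in for query expansion.
SYNONYMS: Dict[str, List[str]] = {
    "braf": ["b-raf"],
    "melanoma": ["malignant melanoma"],
}


def expand_query(terms: List[str]) -> List[str]:
    """Lower-case each term and append its (assumed) synonyms."""
    expanded: List[str] = []
    for term in terms:
        term = term.lower()
        expanded.append(term)
        expanded.extend(SYNONYMS.get(term, []))
    return expanded


def baseline_retrieve(terms: List[str], docs: Dict[str, str], top_k: int) -> List[str]:
    """Stage 1: cheap keyword matching over the whole collection.

    Scores each document by counting occurrences of the expanded terms;
    a real system would delegate this to a retrieval engine.
    """
    expanded = expand_query(terms)
    scored = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        score = sum(lowered.count(term) for term in expanded)
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, _ in scored[:top_k]]


def rerank(candidates: List[str], docs: Dict[str, str],
           evidence_scorer: Callable[[str], float]) -> List[str]:
    """Stage 2: reorder the candidates by an evidence-quality score."""
    return sorted(candidates, key=lambda doc_id: -evidence_scorer(docs[doc_id]))


def toy_evidence_scorer(text: str) -> float:
    """Hypothetical scorer: counts study-design cue words.

    Stands in for a learned model trained on expert annotations.
    """
    cues = ("randomized", "trial", "systematic review")
    lowered = text.lower()
    return float(sum(lowered.count(cue) for cue in cues))


if __name__ == "__main__":
    docs = {
        "d1": "BRAF melanoma case report of one patient.",
        "d2": "Randomized clinical trial of vemurafenib in BRAF-mutant melanoma.",
        "d3": "Diabetes management guidelines.",
    }
    candidates = baseline_retrieve(["BRAF", "melanoma"], docs, top_k=10)
    print(rerank(candidates, docs, toy_evidence_scorer))
```

In this toy collection both matching documents tie at the keyword stage, and only the second stage separates them, which is the division of labor the two components are meant to provide.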
RESULTS
The PM-Search system achieves the best performance in the retrieval of high-quality clinical evidence at the TREC PM Track 2020, outperforming the second-ranked systems by large margins (0.4780 vs. 0.4238 for standard NDCG@30 and 0.4519 vs. 0.4193 for exponential NDCG@30).
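For readers unfamiliar with the two metrics, the sketch below computes NDCG@k under the conventional definitions, assuming a linear gain (the relevance grade itself) for the standard variant, a gain of 2^rel - 1 for the exponential variant, and a log2 rank discount. The exact gain settings used by the track are not stated here, so these definitions and the function names are assumptions.

```python
import math
from typing import Sequence


def dcg_at_k(rels: Sequence[int], k: int, exponential: bool) -> float:
    """Discounted cumulative gain over the first k relevance grades."""
    total = 0.0
    for rank, rel in enumerate(rels[:k], start=1):
        gain = (2 ** rel - 1) if exponential else rel
        total += gain / math.log2(rank + 1)
    return total


def ndcg_at_k(ranked_rels: Sequence[int], judged_rels: Sequence[int],
              k: int, exponential: bool = False) -> float:
    """NDCG@k: DCG of the system ranking divided by the ideal DCG.

    ranked_rels: grades of the returned documents, in ranked order.
    judged_rels: grades of all judged documents for the topic
                 (used to build the ideal ranking).
    """
    ideal = sorted(judged_rels, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k, exponential)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents judged for this topic
    return dcg_at_k(ranked_rels, k, exponential) / ideal_dcg


if __name__ == "__main__":
    judged = [2, 1, 0]
    worst_order = [0, 1, 2]
    print(ndcg_at_k(worst_order, judged, k=3))
    print(ndcg_at_k(worst_order, judged, k=3, exponential=True))
```

The exponential variant weights highly graded documents more heavily, so a ranking that buries the best evidence is penalized more under it than under the standard variant.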
CONCLUSIONS
We present PM-Search, a state-of-the-art search engine to assist the practice of evidence-based PM. PM-Search uses a novel BioBERT-based active learning strategy that models evidence quality and improves model performance. Our analyses show that evidence quality is an aspect distinct from general relevance, and that a PM search engine must model evidence quality specifically, beyond general relevance.