Abstract
Background
One of the challenges in large-scale information retrieval (IR) is developing fine-grained, domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem because both question understanding and answer extraction are difficult tasks. One promising direction in QA is mapping new questions to formerly answered questions that are “similar”.
Results
We propose a novel QA approach based on Recognizing Question Entailment (RQE), and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare logistic regression and deep learning methods for RQE using different kinds of datasets, including textual inference, question similarity, and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources, which we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task, with a 29.8% increase over the best official score.
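The IR-then-RQE pipeline described above can be sketched in simplified form. The snippet below is an illustration only, not the paper's system: the question collection is a hypothetical stand-in for MedQuAD-style pairs, and a toy word-overlap heuristic stands in for the trained logistic-regression or deep-learning RQE classifier.

```python
from collections import Counter
import math

# Hypothetical mini-collection of previously answered questions
# (a stand-in for MedQuAD-style question-answer pairs).
QA_PAIRS = [
    ("what are the symptoms of type 2 diabetes", "Common symptoms include increased thirst and fatigue."),
    ("how is high blood pressure treated", "Treatment options include lifestyle changes and medication."),
    ("what causes migraine headaches", "Triggers can include stress, certain foods, and sleep changes."),
]

def tokens(text):
    return text.lower().split()

def ir_score(query, question):
    """IR step: cosine similarity over term-frequency vectors."""
    q, d = Counter(tokens(query)), Counter(tokens(question))
    num = sum(q[t] * d[t] for t in set(q) & set(d))
    den = (math.sqrt(sum(v * v for v in q.values()))
           * math.sqrt(sum(v * v for v in d.values())))
    return num / den if den else 0.0

def rqe_score(new_question, candidate_question):
    """RQE step (toy heuristic in place of a trained classifier):
    fraction of the candidate question's words covered by the new question."""
    p, h = set(tokens(new_question)), set(tokens(candidate_question))
    return len(p & h) / len(h) if h else 0.0

def answer(query, k=2, threshold=0.3):
    # 1) IR: retrieve the top-k most similar answered questions.
    ranked = sorted(QA_PAIRS, key=lambda qa: ir_score(query, qa[0]), reverse=True)[:k]
    # 2) RQE: keep only candidates plausibly entailed by the new question.
    entailed = [(q, a) for q, a in ranked if rqe_score(query, q) >= threshold]
    # 3) Return the answers of the entailed questions, best first.
    return [a for _, a in entailed]
```

In this sketch the IR stage narrows the search space cheaply, and the entailment stage acts as a precision filter before answers are returned, mirroring the two-step combination the abstract describes.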
Conclusions
The evaluation results support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology
Cited by 56 articles.