SS-BERT: A Semantic Information Selecting Approach for Open-Domain Question Answering
Published:2023-04-03
Issue:7
Volume:12
Page:1692
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics
Author:
Fu Xuan1, Du Jiangnan2, Zheng Hai-Tao3, Li Jianfeng2, Hou Cuiqin2, Zhou Qiyu2, Kim Hong-Gee4
Affiliation:
1. Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
2. Ping An Technology, Shenzhen 518000, China
3. Pengcheng Laboratory, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
4. Dental College, Seoul National University, Seoul 08826, Republic of Korea
Abstract
Open-Domain Question Answering (Open-Domain QA) aims to answer any factoid question from users. Recent progress in Open-Domain QA adopts the "retriever-reader" structure, which has proven effective. Retrievers fall mainly into two categories: sparse retrievers and dense retrievers. In recent work, dense retrievers have shown stronger semantic interpretation than sparse retrievers. Training a dual-encoder dense retriever for document retrieval and reranking raises two challenges: negative selection and a lack of training data. In this study, we make three major contributions to this topic: negative selection by query generation, data augmentation from negatives, and a passage evaluation method. We show that the model performs better when it focuses on false negatives and data augmentation in the Open-Domain QA passage rerank task. Our model outperforms other single dual-encoder rerankers built on BERT-base and BM25 by 0.7 in MRR@10, and it achieves the highest Recall@50 and the maximum Recall@1000, which is bounded by the BM25 retrieval results.
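The abstract reports results in MRR@10 and Recall@k. As a rough illustration of what those metrics measure, the sketch below computes both over a toy reranking output. It is not the authors' evaluation code: the function names, the dictionary layout, and the per-query averaging used for Recall@k are assumptions made for this example, and the official evaluation script of the benchmark may differ in details.

```python
from typing import Dict, List, Set


def mrr_at_k(ranked: Dict[str, List[str]],
             relevant: Dict[str, Set[str]],
             k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k."""
    if not ranked:
        return 0.0
    total = 0.0
    for qid, passages in ranked.items():
        gold = relevant.get(qid, set())
        for rank, pid in enumerate(passages[:k], start=1):
            if pid in gold:
                total += 1.0 / rank  # only the first hit counts
                break
    return total / len(ranked)


def recall_at_k(ranked: Dict[str, List[str]],
                relevant: Dict[str, Set[str]],
                k: int) -> float:
    """Fraction of each query's relevant passages found in the top k,
    averaged over queries that have at least one relevant passage
    (an assumed convention for this sketch)."""
    scores = []
    for qid, passages in ranked.items():
        gold = relevant.get(qid, set())
        if not gold:
            continue  # skip queries without judged relevant passages
        hits = len(gold.intersection(passages[:k]))
        scores.append(hits / len(gold))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical query and passage ids).
ranked = {"q1": ["p3", "p1", "p7"], "q2": ["p9", "p4", "p2"]}
relevant = {"q1": {"p1"}, "q2": {"p2", "p5"}}

print(mrr_at_k(ranked, relevant, k=10))
print(recall_at_k(ranked, relevant, k=2))
```

Because a reranker only reorders the candidates handed to it, its Recall@1000 over a 1000-passage BM25 candidate list cannot exceed the recall of that list, which is the bound the abstract refers to.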
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering
Cited by
2 articles.