Enhancing Answer Selection via Ad-Hoc Knowledge Extraction from Unstructured Web Texts

Published:2023-05-13 Issue:06 Volume:33 Page:933-951
ISSN:0218-1940
Container-title:International Journal of Software Engineering and Knowledge Engineering
language:en
Short-container-title:Int. J. Soft. Eng. Knowl. Eng.

Author:

Gu Shengwei¹², Luo Xiangfeng¹, Wang Hao¹

Affiliation:

1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, P. R. China

2. School of Computer and Information Engineering, Chuzhou University, Chuzhou 239000, P. R. China

Abstract

Answer selection aims to identify the most relevant answers to a given question from a set of candidates. It is a fundamental component of intelligent question answering systems. Integrating external structured knowledge bases (KBs) into answer selection models has gradually become an effective strategy for improving performance. However, because such KBs are expensive to construct and maintain, these models suffer from domain barriers and information incompleteness. In this paper, we propose a two-stage extraction–comprehension answer selection model that extracts ad-hoc knowledge from unstructured web texts to enhance answer selection. In the extraction stage, two types of snippets are extracted from unstructured web pages and used as the source of ad-hoc knowledge. In the comprehension stage, a selective attention mechanism extracts and integrates ad-hoc knowledge from the multiple text snippets obtained in the first stage, which enriches the representation of question–answer pairs and helps identify the correct answers more accurately. By incorporating ad-hoc knowledge extracted from both types of snippets, the proposed model achieves state-of-the-art results on two publicly available benchmark datasets. In particular, on WikiQA, its mean average precision and mean reciprocal rank are 9.9% and 8.4% higher, respectively, than those of previous non-pretraining-based models, and 3.4% and 3.2% higher than those of pretraining-based models.
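
The two evaluation metrics named in the abstract, mean average precision (MAP) and mean reciprocal rank (MRR), can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function names and the toy relevance labels below are assumptions made for illustration only. Each inner list holds the 0/1 relevance labels of one question's candidate answers, already sorted by model score from best to worst.

```python
from typing import List


def average_precision(labels: List[int]) -> float:
    """Average precision for one question; labels are sorted best-first."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this correct answer
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels: List[int]) -> float:
    """1 / rank of the first correct answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_average_precision(ranked: List[List[int]]) -> float:
    """Mean of per-question average precision."""
    return sum(average_precision(q) for q in ranked) / len(ranked)


def mean_reciprocal_rank(ranked: List[List[int]]) -> float:
    """Mean of per-question reciprocal rank."""
    return sum(reciprocal_rank(q) for q in ranked) / len(ranked)


# Hypothetical toy data: two questions with ranked candidate labels.
toy_rankings = [[0, 1, 0, 1], [1, 0, 0]]
print(mean_average_precision(toy_rankings))  # 0.75
print(mean_reciprocal_rank(toy_rankings))    # 0.75
```

In this sketch, a question with no correct candidate contributes 0 to both metrics; published evaluations often exclude such questions instead, so scores are only comparable when the same convention is used.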

Funder

Shanghai Outstanding Academic Leaders Plan

National Key Research and Development Program of China

National Natural Science Foundation of China

Shanghai Science and Technology Young Talents Sailing Program

Publisher

World Scientific Pub Co Pte Ltd

Subject

Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218194023500201

References: 43 articles.