Semantic features analysis for biomedical lexical answer type prediction using ensemble learning approach

Published:2024-04-25 Issue:8 Volume:66 Page:5003-5019
ISSN:0219-1377
Container-title:Knowledge and Information Systems
language:en
Short-container-title:Knowl Inf Syst

Author:

Hussain Fiza Gulzar,Wasim Muhammad,Cheema Sehrish Munawar,Pires Ivan Miguel

Abstract

Lexical answer type (LAT) prediction is integral to biomedical question–answering (QA) systems. It aims to predict the semantic type of the expected answer to a factoid or list-type biomedical question, and it helps the answer processing stage of a QA system assign high scores to the most relevant answers. Although considerable research exists on LAT prediction in diverse domains, it remains a challenging problem in the biomedical domain. Biomedical LAT prediction is a multi-label classification problem, since one biomedical question might have more than one expected answer type. Achieving high performance is difficult because biomedical questions offer only limited lexical features, yet a single question must be assigned multiple labels from them. In this paper, we develop a novel feature set (lexical, noun concepts, verb concepts, protein–protein interactions, and biomedical entities) from these lexical features. We combine the label power set transformation technique with bagging-based ensemble learning to perform multi-label classification. We evaluate our proposed methodology on the publicly available multi-label biomedical questions dataset (MLBioMedLAT) and compare it with twelve state-of-the-art multi-label classification algorithms. Our proposed method attains a micro-F1 score of 77%, outperforming the baseline model by 25.5%.
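The abstract names three building blocks: the label power set transformation, a bagging ensemble, and the micro-F1 metric. The following is a minimal sketch of how they fit together, not the authors' implementation. The label names, the three sets of model predictions, and all function names are hypothetical. Only the vote-aggregation step of bagging is shown; bootstrap sampling and the base learners are omitted.

```python
from collections import Counter


def to_powerset(label_sets):
    """Label power set: map each distinct label combination to one class id."""
    class_of = {}
    classes = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        classes.append(class_of[key])
    return classes, class_of


def from_powerset(classes, class_of):
    """Invert the transformation: class ids back to label sets."""
    inverse = {cid: key for key, cid in class_of.items()}
    return [set(inverse[c]) for c in classes]


def majority_vote(predictions_per_model):
    """Aggregate per-question class ids across models by majority vote."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1: pool label-level TP, FP, FN over all questions."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical gold answer types for four questions.
gold = [{"protein", "gene"}, {"disease"}, {"protein", "gene"}, {"drug"}]
classes, class_of = to_powerset(gold)

# Hypothetical class-id predictions from three bagged base models.
model_predictions = [
    [0, 1, 1, 2],
    [0, 1, 1, 2],
    [0, 2, 0, 2],
]
voted = majority_vote(model_predictions)
predicted = from_powerset(voted, class_of)
score = micro_f1(gold, predicted)
```

Because the power set treats each label combination as a single class, the ensemble can only predict combinations seen during training. That is a known trade-off of this transformation.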

Funder

Universidade de Aveiro

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10115-024-02113-7.pdf

Reference: 45 articles.

1. Shortliffe EH, Chiang MF (2021) Biomedical data: their acquisition, storage, and use. Biomedical informatics: computer applications in health care and biomedicine. Springer, Cham, pp 45–75

2. Jin Q, Yuan Z, Xiong G, Yu Q, Ying H, Tan C, Chen M, Huang S, Liu X, Yu S (2022) Biomedical question answering: a survey of approaches and challenges. ACM Comput Surv (CSUR) 55(2):1–36

3. Antoniou C, Bassiliades N (2022) A survey on semantic question answering systems. Knowl Eng Rev 37:2

4. Li X, Roth D (2002) Learning question classifiers. In: COLING 2002: the 19th international conference on computational linguistics

5. Neves M, Kraus M (2016) BioMedLAT corpus: annotation of the lexical answer type for biomedical questions. In: Proceedings of the open knowledge base and question answering workshop (OKBQA 2016), pp 49–58