AN EXPERIMENTAL EVALUATION OF DEEP NEURAL NETWORK MODEL PERFORMANCE FOR THE RECOGNITION OF CONTRADICTORY MEDICAL RESEARCH CLAIMS USING SMALL AND MEDIUM-SIZED CORPORA
-
Published:2021-12-31
Issue:
Volume:
Page:68-77
-
ISSN:0127-9084
-
Container-title:Malaysian Journal of Computer Science
-
language:
-
Short-container-title:MJCS
Author:
Yazi Fatin Syafiqah,Vong Wan-Tze,Raman Valliappan,Hui Then Patrick Hang,J Lunia Mukulraj
Abstract
Corpora come in various shapes and sizes and play an essential role in facilitating Natural Language Processing (NLP) tasks. However, the availability of corpora specialized for Evidence-Based Medicine (EBM) related tasks is limited. The study is aimed to discover how the size of a corpus influence the performance of our Deep Neural Network (DNN) model developed for contradiction detection in medical literature. We explored the potential of the EBM Summarizer corpus by Mollá and Santiago-Martínez, a medium-sized corpus to be used with our contradiction detection model. The dataset preparation involves the filtering of open-ended questions, duplicates of claims, and vague claims. As a result, two datasets were created with the claim input represented by sniptext in one dataset and longtext in the other. Experiments were conducted with varying numbers of hidden layers and units of the model using different datasets. The performance of the DNN model was recorded and compared with the result of using a small-sized corpus. It was found that the DNN model performance did not improve even after it was trained with a larger dataset derived from the medium-sized corpus. The factors may include the limitation of the DNN model itself and the quality of the datasets.
Publisher
Univ. of Malaya
Subject
General Computer Science