Author:
Ghosal Tirthankar,Edithal Vignesh,Ekbal Asif,Bhattacharyya Pushpak,Chivukula Srinivasa Satya Sameer Kumar,Tsatsaronis George
Abstract
AbstractDetecting, whether a document contains sufficient new information to be deemed as novel, is of immense significance in this age of data duplication. Existing techniques for document-level novelty detection mostly perform at the lexical level and are unable to address the semantic-level redundancy. These techniques usually rely on handcrafted features extracted from the documents in a rule-based or traditional feature-based machine learning setup. Here, we present an effective approach based on neural attention mechanism to detect document-level novelty without any manual feature engineering. We contend that the simple alignment of texts between the source and target document(s) could identify the state of novelty of a target document. Our deep neural architecture elicits inference knowledge from a large-scale natural language inference dataset, which proves crucial to the novelty detection task. Our approach is effective and outperforms the standard baselines and recent work on document-level novelty detection by a margin of
$\sim$
3% in terms of accuracy.
Publisher
Cambridge University Press (CUP)
Subject
Artificial Intelligence,Linguistics and Language,Language and Linguistics,Software
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献