Authors:
Cheng Yaokai, Chen Ruoyu, Yuan Xiaoguang, Yang Yuting, Jiang Shan, Yang Bo
Abstract
Long-form document matching is an important direction in natural language processing, with applications in tasks such as news recommendation and text clustering. However, the semantic information in long text is both noisy and sparse, so applying short-form document matching methods directly to long-form matching problems gives unsatisfactory results. The problem has therefore attracted the attention of researchers, who have proposed many effective methods. Methods for matching long texts can be divided into three categories: traditional bag-of-words-based models, traditional deep learning-based models, and pre-training-based models. This study reviews typical methods of long-form document matching, analyzes their advantages and disadvantages, and discusses possible future developments.
Subject
Computer Science Applications, History, Education