Overview of Long-form Document Matching: Survey of Existing Models and Their Challenges

Published:2022-01-01 Issue:1 Volume:2171 Page:012059
ISSN:1742-6588
Container-title:Journal of Physics: Conference Series
Short-container-title:J. Phys.: Conf. Ser.

Author:

Cheng Yaokai, Chen Ruoyu, Yuan Xiaoguang, Yang Yuting, Jiang Shan, Yang Bo

Abstract

Long-form document matching is an important direction in the field of natural language processing and can be applied to tasks such as news recommendation and text clustering. However, long-form document matching suffers from noisiness and sparsity of semantic information in long text. Using short-form document matching methods on a long-form matching problem is not satisfactory. Long-form document matching has attracted the attention of researchers, who have proposed many effective methods. Methods for matching long texts can be divided into three categories: traditional bag-of-words-based models, traditional deep learning-based models, and pre-training-based models. This study reviews typical methods of long-form document matching, analyzes their advantages and disadvantages, and discusses possible future developments.

Publisher

IOP Publishing

Subject

Computer Science Applications, History, Education

Link

https://iopscience.iop.org/article/10.1088/1742-6596/2171/1/012059/pdf

Reference: 17 articles.

1. Learning deep structured semantic models for web search using clickthrough data; Huang; 2013

2. A latent semantic model with the convolutional-pooling structure for information retrieval; Shen; 2014

3. Convolutional neural network architectures for matching natural language sentences; Hu; 2014

4. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval; Palangi; IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP); 2016

5. A deep architecture for semantic matching with multiple positional sentence representations; Wan; 2016

Cited by 2 articles.

1. Research on Long Text Similarity Calculation Method Based on TextRank and BERT; 2024 4th Asia Conference on Information Engineering (ACIE); 2024-01-26

2. Hierarchical and Multiple-Perspective Interaction Network for Long Text Matching; IEEE Access; 2024