Abstract
Current pre-training work in natural language generation pays little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with much less pre-training data and fewer parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat) and generative question answering (CoQA). The source code and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE/ernie-gen.
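The abstract names a noise-aware generation method without describing its mechanics. As a rough illustration only, the sketch below assumes one plausible reading: during training, tokens of the target sequence fed to the decoder are randomly replaced with arbitrary vocabulary tokens, so the model learns to continue from imperfect history. The function name `corrupt_target`, the `noise_rate` parameter, and the toy vocabulary are hypothetical and do not come from the released code.

```python
import random


def corrupt_target(tokens, vocab, noise_rate=0.05, rng=None):
    """Return a copy of `tokens` with random replacements.

    Assumption (not confirmed by the abstract): each target token is
    independently swapped for a random vocabulary token with
    probability `noise_rate`.
    """
    rng = rng or random.Random()
    return [
        rng.choice(vocab) if rng.random() < noise_rate else tok
        for tok in tokens
    ]


# Hypothetical toy data for illustration.
vocab = ["the", "a", "cat", "dog", "sat", "ran", "mat"]
target = ["the", "cat", "sat", "on", "the", "mat"]
noisy = corrupt_target(target, vocab, noise_rate=0.3, rng=random.Random(0))
print(noisy)
```

The corrupted sequence would serve only as decoder input; the training labels would remain the original, uncorrupted target tokens.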
Publisher
International Joint Conferences on Artificial Intelligence Organization
Cited by
47 articles.
1. Analysis of LLMs for Educational Question Classification and Generation;Computers and Education: Artificial Intelligence;2024-09
2. Towards Vietnamese Question and Answer Generation: An Empirical Study;ACM Transactions on Asian and Low-Resource Language Information Processing;2024-08-16
3. CIQA: A Coding Inspired Question Answering Model;Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval;2024-07-10
4. CPGA-BOT: A Customized Power Grid Assistant chatBOT Fine-Tuning in Large Language Model;2024 5th International Conference on Information Science, Parallel and Distributed Systems (ISPDS);2024-05-31
5. A fusion topology method for generating new equipment startup schemes for power grids;Frontiers in Energy Research;2024-05-30