Utilizing Latent Diffusion Model to Accelerate Sampling Speed and Enhance Text Generation Quality

Author:

Li Chenyang12,Zhang Long12,Zheng Qiusheng12

Affiliation:

1. The Frontier Information Technology Research Institute, Zhongyuan University of Technology, Zhengzhou 450007, China

2. Henan Key Laboratory on Public Opinion Intelligent Analysis, Zhengzhou 450007, China

Abstract

Diffusion models have achieved tremendous success in modeling continuous data modalities, such as images, audio, and video, yet their application in discrete data domains (e.g., natural language) has been limited. Existing methods primarily represent discrete text in a continuous diffusion space, incurring significant computational overhead during training and resulting in slow sampling speeds. This paper introduces LaDiffuSeq, a latent diffusion-based text generation model incorporating an encoder–decoder structure. Specifically, it first employs a pretrained encoder to map sequences composed of attributes and corresponding text into a low-dimensional latent vector space. Then, without the guidance of a classifier, it performs the diffusion process for the sequence’s corresponding latent space. Finally, a pretrained decoder is used to decode the newly generated latent vectors, producing target texts that are relevant to themes and possess multiple emotional granularities. Compared to the benchmark model, DiffuSeq, this model achieves BERTScore improvements of 0.105 and 0.009 on two public real-world datasets (ChnSentiCorp and a debate dataset), respectively; perplexity falls by 3.333 and 4.562; and it effectively quadruples the text generation sampling speed.

Publisher

MDPI AG

Reference59 articles.

1. Language models are unsupervised multitask learners;Radford;OpenAI Blog,2019

2. Gehring, J., Auli, M., Grangier, D., Yarats, D., and Dauphin, Y.N. (2017, January 6–11). Convolutional sequence to sequence learning. Proceedings of the International Conference on Machine Learning, Sydney, Australia.

3. Attention is all you need;Vaswani;Adv. Neural Inf. Process. Syst.,2017

4. Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2019). Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv.

5. Feature-based detection of automated language models: Tackling GPT-2, GPT-3 and Grover;Zubiaga;PeerJ Comput. Sci.,2021

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3