Utilizing Latent Diffusion Model to Accelerate Sampling Speed and Enhance Text Generation Quality
Published: 2024-03-15
Issue: 6
Volume: 13
Page: 1093
ISSN: 2079-9292
Container-title: Electronics
Short-container-title: Electronics
Language: en
Author:
Li Chenyang 1,2, Zhang Long 1,2, Zheng Qiusheng 1,2
Affiliation:
1. The Frontier Information Technology Research Institute, Zhongyuan University of Technology, Zhengzhou 450007, China
2. Henan Key Laboratory on Public Opinion Intelligent Analysis, Zhengzhou 450007, China
Abstract
Diffusion models have achieved tremendous success in modeling continuous data modalities, such as images, audio, and video, yet their application to discrete data domains (e.g., natural language) has been limited. Existing methods primarily represent discrete text in a continuous diffusion space, incurring significant computational overhead during training and resulting in slow sampling speeds. This paper introduces LaDiffuSeq, a latent diffusion-based text generation model with an encoder–decoder structure. Specifically, it first employs a pretrained encoder to map sequences composed of attributes and the corresponding text into a low-dimensional latent vector space. It then runs the diffusion process directly in that latent space, without classifier guidance. Finally, a pretrained decoder decodes the newly generated latent vectors into target texts that are topically relevant and exhibit multiple levels of emotional granularity. Compared to the benchmark model DiffuSeq, LaDiffuSeq improves BERTScore by 0.105 and 0.009 on two public real-world datasets (ChnSentiCorp and a debate dataset), reduces perplexity by 3.333 and 4.562, respectively, and effectively quadruples text generation sampling speed.
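The pipeline described in the abstract (pretrained encoder → diffusion in a low-dimensional latent space → pretrained decoder) can be summarized in code. The following PyTorch sketch is purely illustrative: the module architectures, the dimensions (d_latent=64, vocab_size=8000), the z0-predicting denoiser, and the DDIM-style few-step sampler are assumptions chosen for brevity, not the implementation published with LaDiffuSeq.

```python
# Minimal sketch of an encoder -> latent diffusion -> decoder text generation
# pipeline, in the spirit of LaDiffuSeq as described in the abstract. Every
# architectural choice below (layer sizes, d_latent=64, the z0-predicting
# denoiser, the DDIM-style sampler) is an illustrative assumption, not the
# authors' published implementation.
import torch
import torch.nn as nn


class LatentEncoder(nn.Module):
    """Stand-in for the pretrained encoder that maps (attribute + text)
    token sequences into a low-dimensional latent vector space."""
    def __init__(self, vocab_size=8000, d_model=256, d_latent=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_latent = nn.Linear(d_model, d_latent)

    def forward(self, tokens):                      # tokens: (B, T)
        return self.to_latent(self.encoder(self.embed(tokens)))  # (B, T, d_latent)


class LatentDecoder(nn.Module):
    """Stand-in for the pretrained decoder that maps denoised latents back to tokens."""
    def __init__(self, vocab_size=8000, d_latent=64):
        super().__init__()
        self.to_logits = nn.Linear(d_latent, vocab_size)

    def forward(self, z):                           # z: (B, T, d_latent)
        return self.to_logits(z)                    # (B, T, vocab_size)


class Denoiser(nn.Module):
    """Predicts the clean latent z0 from a noisy latent z_t and timestep t."""
    def __init__(self, d_latent=64, n_timesteps=1000):
        super().__init__()
        self.time_embed = nn.Embedding(n_timesteps, d_latent)
        self.net = nn.Sequential(nn.Linear(d_latent, 256), nn.GELU(),
                                 nn.Linear(256, d_latent))

    def forward(self, z_t, t):                      # z_t: (B, T, d), t: (B,)
        return self.net(z_t + self.time_embed(t)[:, None, :])


@torch.no_grad()
def sample_latents(denoiser, shape, steps=50, n_timesteps=1000, device="cpu"):
    """Deterministic DDIM-style sampler: because diffusion runs in a compact
    latent space, far fewer denoising steps are needed, which is where the
    sampling-speed gain comes from."""
    betas = torch.linspace(1e-4, 0.02, n_timesteps, device=device)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)
    z = torch.randn(shape, device=device)           # start from pure Gaussian noise
    ts = torch.linspace(n_timesteps - 1, 0, steps, device=device).long()
    for i, t in enumerate(ts):
        a_t = alpha_bar[t]
        z0_hat = denoiser(z, t.expand(shape[0]))    # predict the clean latent
        eps_hat = (z - a_t.sqrt() * z0_hat) / (1.0 - a_t).clamp_min(1e-8).sqrt()
        a_next = alpha_bar[ts[i + 1]] if i + 1 < len(ts) else torch.ones((), device=device)
        z = a_next.sqrt() * z0_hat + (1.0 - a_next).clamp_min(0.0).sqrt() * eps_hat
    return z


if __name__ == "__main__":
    encoder = LatentEncoder()   # used at training time to obtain target latents z0
    decoder, denoiser = LatentDecoder(), Denoiser()
    # Conditioning on attribute tokens is omitted here; in a classifier-free
    # setup the condition would be an extra input to the denoiser.
    z = sample_latents(denoiser, shape=(2, 32, 64))  # 2 sequences, 32 positions
    token_ids = decoder(z).argmax(dim=-1)            # (2, 32) generated token ids
    print(token_ids.shape)
```

The key design point mirrored here is that the denoising loop operates on compact latent vectors rather than full word-embedding sequences, so a short sampling schedule (e.g., 50 steps above) already yields usable latents, which is consistent with the reported speedup over diffusing in the embedding space.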