Investigation of the Batch Size Influence on the Quality of Text Generation by the SeqGAN Neural Network

Published: 2021
Container-title: Proceedings of the 31th International Conference on Computer Graphics and Vision. Volume 2

Author:

Krivosheev Nikolay¹, Vik Ksenia¹², Ivanova Yulia¹, Spitsyn Vladimir¹

Affiliation:

1. Tomsk Polytechnic University

2. Tomsk State University of Architecture and Building

Abstract

One problem with text generation using an LSTM neural network is that generation quality decreases as the length of the generated text increases. Various solutions based on generative adversarial networks have been proposed to improve text generation quality. This work pre-trains an LSTM neural network using the MLE approach and then trains it further within the SeqGAN framework. The presented results show that the SeqGAN-based approach improves text generation quality according to the NLL and BLEU metrics. The influence of the batch size used during adversarial training of SeqGAN on text generation quality is studied. It is shown that increasing the batch size during adversarial training improves the quality of the trained LSTM network. In this work, the Monte Carlo algorithm is not used in SeqGAN training. Image captions from the COCO Image Captions dataset are used for training and testing. Text generation quality is assessed with the NLL and BLEU metrics. Examples of generated texts are given, together with their BLEU scores.

Publisher

Keldysh Institute of Applied Mathematics

References: 11 articles.

1. S. Hochreiter, J. Schmidhuber, Long Short-Term Memory, Neural Computation 9(8) (1997) 1735–1780. doi: 10.1162/neco.1997.9.8.1735.

2. J.S. Cramer, Econometric Applications of Maximum Likelihood Methods, Cambridge University Press, 1986. doi: 10.1017/CBO9780511572050.

3. L. Yu, W. Zhang, J. Wang, Y. Yu, SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient, in: AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017, pp. 2852–2858. arXiv:1609.05473.

4. X. Chen, H. Fang, T.-Y. Lin, R. Vedantam, S. Gupta, P. Dollar, C.L. Zitnick, Microsoft COCO Captions: Data Collection and Evaluation Server, 2015. arXiv:1504.00325.

5. J. Guo, S. Lu, H. Cai, W. Zhang, Y. Yu, J. Wang, Long Text Generation via Adversarial Training with Leaked Information, in: The Thirty-Second AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018, pp. 5141–5148. arXiv:1709.08624.

Cited by 2 articles.

1. Enhancing Cloud-Native System Robustness: Time-Series GAN-Powered Fault Injection Model;2023 5th International Conference on Frontiers Technology of Information and Computer (ICFTIC);2023-11-17

2. Trends and Challenges of Text-to-Image Generation: Sustainability Perspective;Croatian Regional Development Journal;2023-06-01