Author:
Skurzhanskyi O.H., Marchenko O.O., Anisimov A.V.
Title
Specialized Pre-Training of Neural Networks on Synthetic Data for Improving Paraphrase Generation
Abstract
Generating paraphrases is a fundamental problem in natural language processing. Given the significant success of transfer learning, the "pre-training then fine-tuning" approach has become the standard. However, popular general-purpose pre-training methods typically require large datasets and substantial computational resources, and publicly available pre-trained models are constrained to a fixed architecture and size. We propose a simple and effective pre-training approach tailored specifically to paraphrase generation, which significantly improves model quality and matches the quality of general-purpose pre-trained models. Both existing public data and new data generated by large language models were used. We investigated the effect of this procedure on neural networks of different architectures and showed that it works for all of them.
Keywords: artificial intelligence, machine learning, neural networks, paraphrase generation, pre-training, fine-tuning.
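The abstract describes a two-stage recipe: pre-train a sequence-to-sequence model on paraphrase pairs (including synthetic pairs produced by a large language model), then fine-tune it on a target paraphrase corpus. The sketch below illustrates one training step of such a pipeline in Python with the Hugging Face Transformers library; the model name ("t5-small"), the "paraphrase:" task prefix, and the toy data are illustrative assumptions, not the authors' exact setup.

# Minimal sketch of a "pre-train, then fine-tune" step for paraphrase
# generation. Assumptions: Hugging Face Transformers with a T5 checkpoint;
# the data and hyperparameters are placeholders, not the paper's setup.
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Stage 1 (specialized pre-training) would iterate over synthetic pairs,
# e.g. generated by a large language model; stage 2 (fine-tuning) over a
# smaller curated paraphrase corpus. Both stages use the same loss.
pairs = [("the cat sat on the mat", "a cat was sitting on the mat")]

inputs = tokenizer(["paraphrase: " + src for src, _ in pairs],
                   return_tensors="pt", padding=True)
labels = tokenizer([tgt for _, tgt in pairs],
                   return_tensors="pt", padding=True).input_ids

# Standard seq2seq cross-entropy; an optimizer step would follow.
loss = model(**inputs, labels=labels).loss
loss.backward()

In this framing, "specialized pre-training" differs from general-purpose pre-training only in the data fed to the same objective, which is why, as the abstract notes, the procedure applies to different model architectures.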
Publisher
V.M. Glushkov Institute of Cybernetics