Self-supervised Text Style Transfer using Cycle-Consistent Adversarial Networks

Published: 2024-07-18
ISSN: 2157-6904
Container-title: ACM Transactions on Intelligent Systems and Technology
Language: en
Short-container-title: ACM Trans. Intell. Syst. Technol.

Author:

La Quatra Moreno¹, Gallipoli Giuseppe², Cagliero Luca²

Affiliation:

1. Kore University of Enna, Italy

2. Politecnico di Torino, Italy

Abstract

Text Style Transfer (TST) is a relevant branch of natural language processing that aims to control the style attributes of a piece of text while preserving its original content. To address TST in the absence of parallel data, Cycle-consistent Generative Adversarial Networks (CycleGANs) have recently emerged as promising solutions. Existing CycleGAN-based TST approaches suffer from the following limitations: (1) They apply self-supervision, based on the cycle-consistency principle, in the latent space. This approach turns out to be less robust to mixed-style inputs, i.e., when the source text is partly in the original and partly in the target style; (2) Generators and discriminators rely on recurrent networks, which are exposed to known issues with long-term text dependencies; (3) The target style is weakly enforced, as the discriminator distinguishes real from fake sentences without explicitly accounting for the generated text’s style. We propose a new CycleGAN-based TST approach that applies self-supervision directly at the sequence level to effectively handle mixed-style inputs and employs Transformers to leverage the attention mechanism for both text encoding and decoding. We also employ a pre-trained style classifier to guide the generation of text in the target style while maintaining the original content’s meaning. The experimental results achieved on the formality and sentiment transfer tasks show that our approach outperforms existing ones, both CycleGAN-based and not (including an open-source Large Language Model), on benchmark data and shows better robustness to mixed-style inputs.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3678179
