Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

Published: 2023 Volume: 11 Page: 1643-1668
ISSN: 2307-387X
Container-title: Transactions of the Association for Computational Linguistics
Language: en

Author:

Fernandes Patrick¹²³, Madaan Aman⁴, Liu Emmy⁴, Farinhas António²³, Martins Pedro Henrique⁵, Bertsch Amanda⁴, de Souza José G. C.⁵, Zhou Shuyan⁴, Wu Tongshuang⁴, Neubig Graham⁴⁶, Martins André F. T.²³⁵

Affiliation:

1. Carnegie Mellon University, USA. pfernand@cs.cmu.edu

2. Instituto Superior Técnico (Lisbon ELLIS Unit), Portugal

3. Instituto de Telecomunicações, Portugal

4. Carnegie Mellon University, USA

5. Unbabel, Portugal

6. Inspired Cognition, USA

Abstract

Natural language generation has witnessed significant advancements due to the training of large language models on vast internet-scale datasets. Despite these advancements, there exists a critical challenge: These models can inadvertently generate content that is toxic, inaccurate, and unhelpful, and existing automatic evaluation metrics often fall short of identifying these shortcomings. As models become more capable, human feedback is an invaluable signal for evaluating and improving models. This survey aims to provide an overview of recent research that has leveraged human feedback to improve natural language generation. First, we introduce a taxonomy distilled from existing research to categorize and organize the varied forms of feedback. Next, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using feedback or training feedback models. We also discuss existing datasets for human-feedback data collection, and concerns surrounding feedback collection. Finally, we provide an overview of the nascent field of AI feedback, which uses large language models to make judgments based on a set of principles and minimize the need for human intervention. We also release a website of this survey at feedback-gap-survey.info.

Publisher

MIT Press

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

Link

https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl_a_00626/2199585/tacl_a_00626.pdf

Reference: 160 articles.

1. RL4F: Generating natural language feedback with reinforcement learning for repairing model outputs;Akyürek,2023

2. Power to the people: The role of humans in interactive machine learning;Amershi;AI Magazine,2014

3. Concrete problems in AI safety;Amodei;CoRR,2016

4. Identifying weaknesses in machine translation metrics through minimum Bayes risk decoding: A case study for COMET;Amrhein,2022

5. Director: Generator-classifiers for supervised language modeling;Arora,2022

Cited by 7 articles.

1. Towards AI for Software Systems;Proceedings of the 1st ACM International Conference on AI-Powered Software;2024-07-10

2. LLMChain: Blockchain-Based Reputation System for Sharing and Evaluating Large Language Models;2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC);2024-07-02

3. Mitigating Hallucination Issues in Small-Parameter LLMs through Inter-Layer Contrastive Decoding;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

4. RELIC: Investigating Large Language Model Responses using Self-Consistency;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11

5. A Survey of LLM Datasets: From Autoregressive Model to AI Chatbot;Journal of Computer Science and Technology;2024-05