AI deception: A survey of examples, risks, and potential solutions

Published: 2024-05 Issue: 5 Volume: 5 Page: 100988
ISSN: 2666-3899
Container-title: Patterns
Language: en
Short-container-title: Patterns

Author:

Park Peter S., Goldstein Simon, O’Gara Aidan, Chen Michael, Hendrycks Dan

Publisher

Elsevier BV

References: 94 articles.

1. ‘Godfather of AI’ Warns that AI May Figure Out How to Kill People;Hinton,2023

2. Truthful AI: Developing and governing AI that does not lie;Evans;arXiv,2021

3. Characterizing manipulation from AI systems;Carroll;Proceedings of the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization,2023

4. Human-level play in the game of Diplomacy by combining language models with strategic reasoning;Bakhtin;Science,2022

5. Grandmaster level in StarCraft II using multi-agent reinforcement learning;Vinyals;Nature,2019

Cited by 5 articles.

1. The state as a model for AI control and alignment;AI & SOCIETY;2024-09-11

2. Aversion to external feedback suffices to ensure agent alignment;Scientific Reports;2024-09-10

3. Security, Risk Management, and Ethical AI in the Future of DeFi;Advances in Finance, Accounting, and Economics;2024-08-26

4. Deception abilities emerged in large language models;Proceedings of the National Academy of Sciences;2024-06-04

5. Yet Another Example of ChatGPT’s Evasive Tactics During Long Conversations: Japanese Rock Song Lyrics Case;2024 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA);2024-05-07