Artificial Intelligence, Values, and Alignment

Published: 2020-09 Volume: 30 Issue: 3 Pages: 411-437
ISSN: 0924-6495
Container-title: Minds and Machines
Language: en
Short-container-title: Minds & Machines

Author:

Gabriel, Iason

Abstract

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Philosophy

Link

https://link.springer.com/content/pdf/10.1007/s11023-020-09539-2.pdf

References: 116 articles.

1. Abbeel, P. & Ng, A.Y. (2004, July). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on Machine learning (p. 1). ACM.

2. Achiam, J., Held, D., Tamar, A. & Abbeel, P. (2017, August). Constrained policy optimization. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 22–31). JMLR.org.

3. Allen, C., Smit, I., & Wallach, W. (2005). Artificial morality: Top-down, bottom-up, and hybrid approaches. Ethics and Information Technology, 7(3), 149–155.

4. Arkin, R. C., Ulam, P. D., & Duncan, B. (2009). An ethical governor for constraining lethal action in an autonomous system. Georgia: Georgia Institute of Technology.

5. Armstrong, S. (2019). Research Agenda v0.9: Synthesising a human’s preferences into a utility function. 17 June. Lesswrong. Available at: https://www.lesswrong.com/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into-1.

Cited by 150 articles.

1. Aggregating value systems for decision support;Knowledge-Based Systems;2024-03

2. On monitorability of AI;AI and Ethics;2024-02-06

3. What ethics can say on artificial intelligence: Insights from a systematic literature review;Business and Society Review;2024-02-04

4. Ethics of generative AI and manipulation: a design-oriented research agenda;Ethics and Information Technology;2024-02-03

5. Beyond the Business Case for Responsible Artificial Intelligence: Strategic CSR in Light of Digital Washing and the Moral Human Argument;Sustainability;2024-02-01