Abstract
The Morality First strategy for developing AI systems that can represent and respond to human values aims to first develop systems that can represent and respond to moral values. I argue that Morality First and other X-First views are unmotivated. Moreover, if one particular philosophical view about value is true, these strategies are positively distorting. The natural alternative, according to which no domain of value comes “first”, introduces a new set of challenges and highlights an important but otherwise obscured problem for e-AI developers.
Publisher
Springer Science and Business Media LLC
References (47 articles)
1. Aristotle (2014) Nicomachean ethics. Cambridge University Press, Cambridge
2. Babic B (2019) A theory of epistemic risk. Philos Sci 86(3):522–550. https://doi.org/10.1086/703552
3. Bai Y, Jones A, Ndousse K, Askell A, Chen A, DasSarma N, Drain D et al (2022a) Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv https://doi.org/10.48550/arXiv.2204.05862
4. Bai Y, Kadavath S, Kundu S, Askell A, Kernion J, Jones A, Chen A et al (2022b) Constitutional AI: harmlessness from AI feedback. arXiv https://arxiv.org/abs/2212.08073v1
5. Baker DC (2018) Skepticism about ought simpliciter. Oxford studies in metaethics, vol 13. Oxford University Press, Oxford