STELA: a community-centred approach to norm elicitation for AI alignment
Published: 2024-03-19
Issue: 1
Volume: 14
Page:
ISSN: 2045-2322
Container-title: Scientific Reports
Language: en
Short-container-title: Sci Rep
Author: Bergman Stevie, Marchal Nahema, Mellor John, Mohamed Shakir, Gabriel Iason, Isaac William
Abstract
Value alignment, the process of ensuring that artificial intelligence (AI) systems are aligned with human values and goals, is a critical issue in AI research. Existing scholarship has mainly studied how to encode moral values into agents to guide their behaviour. Less attention has been given to the normative questions of whose values and norms AI systems should be aligned with, and how these choices should be made. To tackle these questions, this paper presents the STELA process (SocioTEchnical Language agent Alignment), a methodology resting on sociotechnical traditions of participatory, inclusive, and community-centred processes. For STELA, we conduct a series of deliberative discussions with four historically underrepresented groups in the United States in order to understand their diverse priorities and concerns when interacting with AI systems. The results of our research suggest that community-centred deliberation on the outputs of large language models is a valuable tool for eliciting latent normative perspectives directly from differently situated groups. In addition to having the potential to engender an inclusive process that is robust to the needs of communities, this methodology can provide rich contextual insights for AI alignment.
Publisher
Springer Science and Business Media LLC
Cited by
2 articles.
1. An Ellulian analysis of propaganda in the context of generative AI. Ethics and Information Technology (2024-09).
2. Participation in the age of foundation models. The 2024 ACM Conference on Fairness, Accountability, and Transparency (2024-06-03).