Artificial Intelligence, Values, and Alignment
Iason Gabriel, DeepMind, London, UK
Received: 22 February 2020 / Accepted: 26 August 2020
© The Author(s) 2020
Abstract

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.

Keywords: Artificial intelligence · Machine learning · Value alignment · Moral philosophy · Political theory
1 Introduction

The development and growth of artificial intelligence raises new and important questions for technologists, for humanity, and for sentient life more widely. Foremost among these is the question of what—or whose—values AI systems ought to align with. One vision of AI is broadly utilitarian. It holds that over the long run these technologies should be designed to create the greatest happiness for the largest number of people or sentient creatures. Another approach is Kantian in character. It suggests that the principles governing AI should only be those that we could rationally will to be universal law, for example, principles of fairness or beneficence. Still other approaches focus directly on the role of human direction and volition. They suggest that the major moral challenge is to align AI with human instructions, intentions, or desires. However, this ability to understand and follow human volition might itself need to be constrained in certain ways—something that becomes clear when we think about the possibility of AI being used intentionally to harm others, or the possibility that it could be used in imprudent or self-destructive ways. To forestall these outcomes, it might be wise to design AI in a way that respects the objective interests of sentient beings or aligns with a conception of basic rights, so that there are limits on what it may permissibly do.

Behind each vision for ethically-aligned AI sits a deeper question. How are we to decide which principles or objectives to encode in AI—and who has the right to make these decisions—given that we live in a pluralistic world that is full of competing conceptions of value? Is there a way to think about AI value…