How to Build Ethics into Robust Artificial Intelligence
Humans take great pride in being the only creatures who make moral judgments, even though their moral judgments often suffer from serious flaws. Some AI systems do generate decisions based on their consequences, but consequences are not all there is to morality. Moral judgments are also affected by rights (such as privacy), roles (such as in families), past actions (such as promises), motives and intentions, and other morally relevant features. These diverse factors have not yet been built into AI systems. Our goal is to do just that. Our team plans to combine methods from computer science, philosophy, and psychology in order to construct an AI system that is capable of making plausible moral judgments and decisions in realistic scenarios. We hope that this work will provide a basis that leads to future highly-advanced AI systems acting ethically and thereby being more robust and beneficial. Humans, by comparing their own moral judgments to the output of the resulting system, will be able to understand their own moral judgments and avoid common mistakes (such as partiality and overlooking relevant factors). In these ways and more, moral AI might also make humans more moral.
Most contemporary AI systems base their decisions solely on consequences, whereas humans also consider other morally relevant factors, including rights (such as privacy), roles (such as in families), past actions (such as promises), motives and intentions, and so on. Our goal is to build these additional morally relevant features into an AI system. We will identify morally relevant features by reviewing theories in moral philosophy, conducting surveys in moral psychology, and using machine learning to locate factors that affect human moral judgments. We will use and extend game theory and social choice theory to determine how to make these features more precise, how to weigh conflicting features against each other, and how to build these features into an AI system. We hope that eventually this work will lead to highly advanced AI systems that are capable of making moral judgments and acting on them. Humans will then be able to compare these outputs to their own moral judgments in order to learn which of these judgments are distorted by biases, partiality, or lack of attention to relevant factors. In such ways, moral AI can also contribute to our own understanding of morality and our moral lives.