AI Safety Research

Daniel Weld

Thomas J. Cable / WRF Professor of Computer Science & Engineering and Entrepreneurial Faculty Fellow

University of Washington

Project: Computational Ethics for Probabilistic Planning

Amount Recommended: $200,000

Project Summary

AI systems, whether robotic or conversational software agents, use planning algorithms to achieve high-level goals by exhaustively considering all possible sequences of actions. While these methods are increasingly powerful and can even generate seemingly creative solutions, they have no understanding of ethics: they do not understand harm, nor can they distinguish between good and bad side effects of their actions. We propose to develop representations and algorithms to fill this gap.

Technical Abstract

Recent advances in probabilistic planning and reinforcement learning have resulted in impressive performance at tasks as varied as mobile robotics, self-driving cars, and playing Atari video games. As these algorithms are deployed in real-world environments, it becomes critical to ensure that their utility-seeking behavior does not result in unintended, harmful side effects. We need a way to specify a set of agent ethics: social norms that we can trust the agent will not knowingly violate. Developing mechanisms for defining and enforcing such ethical constraints requires innovations ranging from improved vocabulary grounding to more robust planning and reinforcement learning algorithms.
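
To make the idea of an enforced ethical constraint concrete, here is a minimal sketch in Python. It is an illustration only, not the project's method: the toy domain, the state and action names, the violates_norm rule, and the plan function are all hypothetical. It runs value iteration on a small probabilistic planning problem twice, once with every action available and once with any action that might cause harm removed from consideration.

```python
from typing import Callable, Dict, List, Tuple

# (state, action) -> list of (probability, next_state, step_reward).
Transitions = Dict[Tuple[str, str], List[Tuple[float, str, float]]]

# Hypothetical toy domain: the shortcut is faster on average but has a 20%
# chance of breaking a vase; the detour is slower and always safe.
TRANSITIONS: Transitions = {
    ("start", "shortcut"): [(0.8, "goal", -1.0), (0.2, "vase_broken", -1.0)],
    ("start", "detour"): [(1.0, "hall", -1.0)],
    ("hall", "forward"): [(1.0, "goal", -1.0)],
    ("vase_broken", "forward"): [(1.0, "goal", -1.0)],
}
TERMINAL = {"goal"}
GOAL_REWARD = 10.0


def violates_norm(state: str, action: str) -> bool:
    """Hypothetical norm: never take an action that could break the vase."""
    return any(
        p > 0 and nxt == "vase_broken" for p, nxt, _ in TRANSITIONS[(state, action)]
    )


def plan(
    allowed: Callable[[str, str], bool], gamma: float = 0.95, sweeps: int = 100
) -> Dict[str, str]:
    """Value iteration restricted to the actions that `allowed` accepts."""
    states = {s for s, _ in TRANSITIONS}
    values = {s: 0.0 for s in states | TERMINAL}
    policy: Dict[str, str] = {}
    for _ in range(sweeps):
        for s in states:
            best_q, best_a = None, None
            for (s2, a), outcomes in TRANSITIONS.items():
                if s2 != s or not allowed(s, a):
                    continue  # skip other states and forbidden actions
                q = sum(
                    p * (r + (GOAL_REWARD if nxt in TERMINAL else 0.0)
                         + gamma * values[nxt])
                    for p, nxt, r in outcomes
                )
                if best_q is None or q > best_q:
                    best_q, best_a = q, a
            if best_a is not None:
                values[s], policy[s] = best_q, best_a
    return policy


unconstrained = plan(lambda s, a: True)
constrained = plan(lambda s, a: not violates_norm(s, a))
print(unconstrained["start"], constrained["start"])
```

In this sketch the purely utility-seeking planner chooses the risky shortcut, while the constrained planner chooses the detour. Filtering actions is only one possible enforcement mechanism, and it assumes the harmful outcome can already be named in the planner's state vocabulary, which is the grounding problem the abstract points to.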



  1. “The real threat of artificial intelligence,” Geekwire, May 23, 2016.