New Center for Human-Compatible AI

Published: August 31, 2016

Author: Ariel Conn

The new center will be funded primarily by a generous grant of $5,555,550 from the Open Philanthropy Project. It will focus on value alignment research, in which AI systems and robots are trained using novel methods to understand what a human really wants, rather than relying only on their initial programming.

Stuart Russell is best known as the co-author of Artificial Intelligence: A Modern Approach, which has become the standard textbook for AI students. In recent years, however, he has also become an increasingly strong advocate for AI safety research and for ensuring that the goals of artificial intelligence align with the goals of humans.

In a statement to FLI, Russell (who also sits on the FLI Science Advisory Board) said:

“I’m thrilled to have the opportunity to launch a serious attack on what is — as Nick Bostrom has called it — ‘the essential task of our age.’ It’s obviously in the very early stages but our work (funded previously by FLI) is already leading to some surprising new ideas for what safe AI systems might look like. We hope to find some excellent PhD students and postdocs and to start training the researchers who will take this forward.”

An example of this type of research can be seen in a paper published this month by Russell and other researchers on Cooperative Inverse Reinforcement Learning (CIRL). In inverse reinforcement learning, the AI system or robot has to learn a human’s goals by observing the human in a real-world or simulated environment, and CIRL is a potentially more effective method for teaching the AI to do so. In a press release about the new center, the Open Philanthropy Project listed other possible research avenues, such as:

“Value alignment through, e.g., inverse reinforcement learning from multiple sources (such as text and video).
“Value functions defined by partially observable and partially defined terms (e.g. ‘health,’ ‘death’).
“The structure of human value systems, and the implications of computational limitations and human inconsistency.
“Conceptual questions including the properties of ideal value systems, tradeoffs among humans and long-term stability of values.”

Other funders include the Future of Life Institute and the Defense Advanced Research Projects Agency (DARPA). Co-PIs and collaborators at the center include:

Pieter Abbeel, Associate Professor of Computer Science, UC Berkeley
Anca Dragan, Assistant Professor of Computer Science, UC Berkeley
Tom Griffiths, Professor of Psychology and Cognitive Science, UC Berkeley
Bart Selman, Professor of Computer Science, Cornell University
Joseph Halpern, Professor of Computer Science, Cornell University
Michael Wellman, Professor of Computer Science, University of Michigan
Satinder Singh Baveja, Professor of Computer Science, University of Michigan

In its press release, the Open Philanthropy Project added:

“We also believe that supporting Professor Russell’s work in general is likely to be beneficial. He appears to us to be more focused on reducing potential risks of advanced artificial intelligence (particularly the specific risks we are most focused on) than any comparably senior, mainstream academic of whom we are aware. We also see him as an effective communicator with a good reputation throughout the field.”

This content was first published at futureoflife.org on August 31, 2016.

