
New Center for Human-Compatible AI

Published: August 31, 2016
Author: Ariel Conn


Congratulations to Stuart Russell on the recently announced launch of the Center for Human-Compatible AI!

The new center will be funded primarily by a generous $5,555,550 grant from the Open Philanthropy Project. It will focus on research around value alignment: developing new methods for training AI systems and robots to understand what a human really wants, rather than relying solely on their initial programming.

Russell is best known as the co-author of Artificial Intelligence: A Modern Approach, which has become the standard textbook for AI students. In recent years, however, he has also become an increasingly vocal advocate for AI safety research and for ensuring that the goals of artificial intelligence align with the goals of humans.

In a statement to FLI, Russell (who also sits on the FLI Science Advisory Board) said:

“I’m thrilled to have the opportunity to launch a serious attack on what is — as Nick Bostrom has called it — ‘the essential task of our age.’ It’s obviously in the very early stages but our work (funded previously by FLI) is already leading to some surprising new ideas for what safe AI systems might look like. We hope to find some excellent PhD students and postdocs and to start training the researchers who will take this forward.”

An example of this type of research can be seen in a paper published this month by Russell and other researchers on Cooperative Inverse Reinforcement Learning (CIRL). In standard inverse reinforcement learning, the AI system or robot must infer a human's goals by observing the human's behavior in a real-world or simulated environment; CIRL reframes this as a cooperative game in which the human and the robot act together to maximize the human's reward, which the robot does not know at the outset (a toy illustration of the basic inverse reinforcement learning idea appears after the list below). In a press release about the new center, the Open Philanthropy Project listed other possible research avenues, such as:

  • “Value alignment through, e.g., inverse reinforcement learning from multiple sources (such as text and video).
  • “Value functions defined by partially observable and partially defined terms (e.g. ‘health,’ ‘death’).
  • “The structure of human value systems, and the implications of computational limitations and human inconsistency.
  • “Conceptual questions including the properties of ideal value systems, tradeoffs among humans and long-term stability of values.”
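For readers unfamiliar with inverse reinforcement learning, the sketch below is a minimal, hypothetical illustration of the underlying idea: inferring a hidden reward function from observed human choices. It is not the CIRL algorithm from the paper; the feature vectors, the hidden weights, the Boltzmann-rational choice model, and the number of demonstrations are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each candidate action is described by a feature vector,
# e.g. (speed, safety) of a driving manoeuvre.
features = np.array([
    [1.0, 0.0],   # fast but unsafe
    [0.5, 0.5],   # balanced
    [0.0, 1.0],   # slow but safe
])

# The human's hidden preference over those features (unknown to the learner).
true_w = np.array([0.3, 1.0])

def choice_probs(w):
    """Boltzmann-rational choice distribution implied by reward weights w."""
    scores = features @ w
    exps = np.exp(scores - scores.max())
    return exps / exps.sum()

# Simulate demonstrations: the "human" picks actions noisily by preference.
demos = rng.choice(len(features), size=200, p=choice_probs(true_w))
counts = np.bincount(demos, minlength=len(features))

# Maximum-likelihood IRL: gradient ascent on the log-likelihood of the demos.
# The gradient is the observed feature total minus the expected feature total.
w = np.zeros(2)
learning_rate = 0.1
for _ in range(500):
    p = choice_probs(w)
    grad = counts @ features - counts.sum() * (p @ features)
    w += learning_rate * grad / counts.sum()

print("choice probabilities under the true reward:   ",
      np.round(choice_probs(true_w), 2))
print("choice probabilities under the learned reward:",
      np.round(choice_probs(w), 2))
```

Maximizing the likelihood of the observed choices under a noisy-rationality model is one common way plain inverse reinforcement learning is formulated; CIRL goes further by treating the interaction as a two-player cooperative game in which the human's demonstrations and the robot's actions are chosen jointly.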

Other funders include the Future of Life Institute and the Defense Advanced Research Projects Agency, and other co-PIs and collaborators include:

  • Pieter Abbeel, Associate Professor of Computer Science, UC Berkeley
  • Anca Dragan, Assistant Professor of Computer Science, UC Berkeley
  • Tom Griffiths, Professor of Psychology and Cognitive Science, UC Berkeley
  • Bart Selman, Professor of Computer Science, Cornell University
  • Joseph Halpern, Professor of Computer Science, Cornell University
  • Michael Wellman, Professor of Computer Science, University of Michigan
  • Satinder Singh Baveja, Professor of Computer Science, University of Michigan

In their press release, the Open Philanthropy Project added:

“We also believe that supporting Professor Russell’s work in general is likely to be beneficial. He appears to us to be more focused on reducing potential risks of advanced artificial intelligence (particularly the specific risks we are most focused on) than any comparably senior, mainstream academic of whom we are aware. We also see him as an effective communicator with a good reputation throughout the field.”

This content was first published at futureoflife.org on August 31, 2016.

