AI Researcher Adrian Weller

Published:
October 1, 2016
Author:
Revathi Kumar

AI Safety Research




Adrian Weller

Senior Research Fellow, Department of Engineering

University of Cambridge

aw665@cam.ac.uk

Project: Investigation of Self-Policing AI Agents

Amount Recommended: $50,000




Project Summary

We are unsure about what moral system is best for humans, let alone for potentially super-intelligent machines. It is likely that we shall need to create artificially intelligent agents to provide moral guidance and police issues of appropriate ethical values and best practice, yet this poses significant challenges. Here we propose an initial evaluation of the strengths and weaknesses of one avenue by investigating self-policing intelligent agents. We shall explore two themes: (i) adding a layer of AI agents whose express purpose is to police other AI agents and report unusual or undesirable activity (this might involve setting traps to catch misbehaving agents, and we may consider whether it is wise to allow policing agents to take corrective action against offending agents); and (ii) analyzing simple models of evolving adaptive agents to see whether robust conclusions can be drawn. We aim to survey related literature, identify key areas of hope and concern for future investigation, and obtain preliminary results for possible guarantees. The proposal is for a one-year term to explore the ideas and build initial models, which will be made publicly available, ideally in journals or at conferences or workshops, with extensions likely if progress is promising.





Workshops

  1. The Future of Artificial Intelligence: January 11-13, 2016. New York University, NY.
  2. Reliable Machine Learning in the Wild: June 23, 2016. ICML Workshop, NY.
    • This workshop discussed a wide range of issues related to engineering reliable AI systems. Among the questions discussed were (a) how to estimate causal effects under various kinds of situations (A/B tests, domain adaptation, observational medical data), (b) how to train classifiers to be robust in the face of adversarial attacks (on both training and test data), (c) how to train reinforcement learning systems with risk-sensitive objectives, especially when the model class may be misspecified and the observations are incomplete, and (d) how to guarantee that a learned policy for an MDP satisfies specified temporal logic properties. Several important engineering practices were also discussed, especially engaging a Red Team to perturb/poison data and making sure we are measuring the right data.
    • More details of the workshop can be found at our website: https://sites.google.com/site/wildml2016/.
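As a rough illustration of question (d) above, the simplest kind of temporal logic property is a safety invariant: "an unsafe state is never reached". The sketch below is not taken from the workshop. It assumes a small finite MDP, a deterministic policy, and hypothetical names (TRANSITIONS, SAFE_POLICY, RISKY_POLICY, satisfies_safety), and it checks the invariant by enumerating every state that the policy can reach with positive probability.

```python
from collections import deque


def reachable_states(transitions, policy, start):
    """Return every state reachable with positive probability under `policy`."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        action = policy[state]
        # transitions[state][action] maps each next state to its probability.
        for next_state, prob in transitions[state][action].items():
            if prob > 0 and next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return seen


def satisfies_safety(transitions, policy, start, unsafe):
    """Check the invariant 'no unsafe state is ever reached' for this policy."""
    return reachable_states(transitions, policy, start).isdisjoint(unsafe)


# Hypothetical toy MDP: the "risky" action has a 10% chance of crashing.
TRANSITIONS = {
    "start": {"cautious": {"goal": 1.0}, "risky": {"goal": 0.9, "crash": 0.1}},
    "goal": {"stay": {"goal": 1.0}},
    "crash": {"stay": {"crash": 1.0}},
}
SAFE_POLICY = {"start": "cautious", "goal": "stay", "crash": "stay"}
RISKY_POLICY = {"start": "risky", "goal": "stay", "crash": "stay"}

print(satisfies_safety(TRANSITIONS, SAFE_POLICY, "start", {"crash"}))   # True
print(satisfies_safety(TRANSITIONS, RISKY_POLICY, "start", {"crash"}))  # False
```

Exhaustive reachability only scales to small, fully known models. Richer temporal logic properties, or models learned from data, would likely need a dedicated probabilistic model checker.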

This content was first published at futureoflife.org on October 1, 2016.
