AI Safety Research

Nick Bostrom

Professor, University of Oxford

Director, Future of Humanity Institute

Director, Strategic Artificial Intelligence Research Center

Project: Strategic Research Center for Artificial Intelligence Policy

Amount Recommended: $1,500,000

Project Summary

We propose the creation of a joint Oxford-Cambridge research center, which will develop policies to be enacted by governments, industry leaders, and others in order to minimize risks and maximize benefit from artificial intelligence (AI) development in the longer term. The center will focus explicitly on the long-term impacts of AI, the strategic implications of powerful AI systems as they come to exceed human capabilities in most domains of interest, and the policy responses that could best be used to mitigate the potential risks of this technology.

There are reasons to believe that unregulated and unconstrained development could incur significant dangers, both from “bad actors” like irresponsible governments, and from the unprecedented capability of the technology itself. For past high-impact technologies (e.g. nuclear fission), policy has often followed implementation, giving rise to catastrophic risks. It is important to avoid this with superintelligence: safety strategies, which may require decades to implement, must be developed before broadly superhuman, general-purpose AI becomes feasible.

This center represents a step change in technology policy: a comprehensive initiative to formulate, analyze, and test policy and regulatory approaches for a transformative technology in advance of its creation.

Publications

  1. Bostrom, N. Strategic Implications of Openness in AI Development. April 2016.
    • Covers a breadth of areas identified in the grant application for research, including long-term AI development, singleton versus multipolar scenarios, race dynamics, responsible AI development, identification of failure modes, and forecasting key intervention points.
  2. Ord, Toby. Lessons from the Development of the Atomic Bomb (draft). April 2016.
    • Looks at the technical, political, and strategic aspects of the development of the atomic bomb and compares these directly to considerations surrounding the development of AI. This paper covers core SAIRC research areas including information hazards, race dynamics, information security, geopolitical competition, resource mobilization, decisive strategic advantage, relevant policy and legal precedents for the control and regulation of AI and its development and proliferation, research ethics and the role of researchers in the eventual control of dangerous technologies, and other related areas.
    • It has been accepted for publication in the Bulletin of the Atomic Scientists.
  3. Armstrong, Stuart, et al. Safely Interruptible Agents. April 2016.
    • This paper was presented at UAI and attracted an enormous amount of media attention, with hundreds of articles.

Workshops

  1. The Control Problem of AI: May 2016. Oxford.
    • This workshop covered goals and principles of AI policy and strategy, value alignment for advanced machine learning, the relative importance of AI versus other existential risks, geopolitical strategy, government involvement, analysis of the strategic landscape, theory and methods of communication and engagement, the prospects of coordinated AGI development along the lines of the International Space Station, and a wide array of technical AI control topics.
  2. Policies for Responsible AI Development: Future of Humanity Institute, Oxford.
    • This workshop focused on classifying risks, international governance, and surveillance. The researchers engaged in a series of brainstorming and analysis exercises as a group.

Ongoing Projects

  1. Bostrom and Professor Allan Dafoe of Yale University are working on a paper outlining the desiderata for AI outcomes and potential paths to achieving them.
  2. Andrew Snyder-Beattie has drafted an early version of a paper on the strategic significance of warning shots.
  3. Toby Ord has presented an early version of a paper looking at the dynamics of arms races in AI development. It suggests that there might be strong incentives to avoid ‘races to the precipice’ if all players are rational actors.
  4. Ben Livingston is co-authoring a paper with researchers at MIRI on “Decision Theory and Logical Uncertainty” to help advance the technical research agenda on the control problem.
  5. Andrew Snyder-Beattie, Niel Bowerman, and Peter McIntyre engaged in a multi-week project of exploratory and agenda-setting research on technology forecasting, looking especially at the possible future of blockchain technologies and their strategic and security role, including as it relates to AI strategy.
  6. Anders Sandberg drafted a detailed research agenda on the topic of information hazards.
  7. Carrick Flynn has been working with Allan Dafoe to set up a “Global Politics of AI” research group. It will look at issues of geopolitical strategy, the international and domestic politics of AI development, AI development scenarios, forecasting, policy levers for great power peace, and tracking international investments in AI technologies, among other areas.
  8. Cambridge is working to set up the Leverhulme Centre for the Future of Intelligence. This will greatly leverage the work of SAIRC over the coming decade, growing the community of academics and other partners who bring new expertise in governance, policy, technical machine learning, and other disciplines to bear on the challenges of safe AI development.