AI Safety Research

Nick Bostrom

Professor, University of Oxford

Director, Future of Humanity Institute

Director, Strategic Artificial Intelligence Research Center

Project: Strategic Research Center for Artificial Intelligence Policy

Amount Recommended:    $1,500,000

Project Summary

We propose the creation of a joint Oxford-Cambridge research center, which will develop policies to be enacted by governments, industry leaders, and others in order to minimize risks and maximize benefit from artificial intelligence (AI) development in the longer term. The center will focus explicitly on the long-term impacts of AI, the strategic implications of powerful AI systems as they come to exceed human capabilities in most domains of interest, and the policy responses that could best be used to mitigate the potential risks of this technology.

There are reasons to believe that unregulated and unconstrained development could incur significant dangers, both from “bad actors” like irresponsible governments, and from the unprecedented capability of the technology itself. For past high-impact technologies (e.g. nuclear fission), policy has often followed implementation, giving rise to catastrophic risks. It is important to avoid this with superintelligence: safety strategies, which may require decades to implement, must be developed before broadly superhuman, general-purpose AI becomes feasible.

This center represents a step change in technology policy: a comprehensive initiative to formulate, analyze, and test policy and regulatory approaches for a transformative technology in advance of its creation.

Governing AI: An Inside Look at the Quest to Ensure AI Benefits Humanity

Finance, education, medicine, programming, the arts — artificial intelligence is set to disrupt nearly every sector of our society. Governments and policy experts have started to realize that, in order to prepare for this future, minimize the risks, and ensure that AI benefits humanity, we need to start planning for the arrival of advanced AI systems today.

Although we are still in the early moments of this movement, the landscape looks promising. Several nations and independent firms have already started to strategize and develop policies for the governance of AI. In 2017, the UAE appointed the world’s first Minister of Artificial Intelligence, and Germany took smaller but similar steps that same year, when the Ethics Commission at the German Ministry of Transport and Digital Infrastructure developed the world’s first set of regulatory guidelines for automated and connected driving.

This work is notable; however, these efforts have yet to coalesce into a larger governance framework that extends beyond national boundaries. Nick Bostrom’s Strategic Artificial Intelligence Research Center seeks to assist in resolving this issue by understanding, and ultimately shaping, the strategic landscape of long-term AI development on a global scale.

Developing a Global Strategy: Where We Are Today

The Strategic Artificial Intelligence Research Center was founded in 2015 with the knowledge that, to truly circumvent the threats posed by AI, the world needs a concerted effort focused on tackling unsolved problems related to AI policy and development. The Governance of AI Program (GovAI), co-directed by Bostrom and Allan Dafoe, is the primary research program that has evolved from this center. Its central mission, as articulated by the directors, is to “examine the political, economic, military, governance, and ethical dimensions of how humanity can best navigate the transition to such advanced AI systems.” In this respect, the program is focused on strategy — on shaping the social, political, and governmental systems that influence AI research and development — as opposed to focusing on the technical hurdles that must be overcome in order to create and program safe AI.

To develop a sound AI strategy, the program works with social scientists, politicians, corporate leaders, and artificial intelligence/machine learning engineers to address questions of how we should approach the challenge of governing artificial intelligence. In a recent 80,000 Hours podcast with Rob Wiblin, Dafoe outlined how the team’s research shapes up from a practical standpoint, asserting that the work focuses on answering questions that fall under three primary categories:

  • The Technical Landscape: This category seeks to answer all the questions that are related to research trends in the field of AI with the aim of understanding what future technological trajectories are plausible and how these trajectories affect the challenges of governing advanced AI systems.
  • AI Politics: This category focuses on questions that are related to the dynamics of different groups, corporations, and governments pursuing their own interests in relation to AI, and it seeks to understand what risks might arise as a result and how we may be able to mitigate these risks.
  • AI Governance: This category examines positive visions of a future in which humanity coordinates to govern advanced AI in a safe and robust manner. This raises questions such as how this framework should operate and what values we would want to encode in a governance regime.

The above categories provide a clearer way of understanding the various objectives of those invested in researching AI governance and strategy; however, these categories are fairly large in scope. To help elucidate the work they are performing, Jade Leung, a researcher with GovAI and a DPhil candidate in International Relations at the University of Oxford, outlined some of the specific workstreams that the team is currently pursuing.

One of the most intriguing areas of research is the Chinese AI Strategy workstream. This line of research examines things like China’s AI capabilities vis-à-vis other countries, official documentation regarding China’s AI policy, and the various power dynamics at play in the nation with an aim of understanding, as Leung summarizes, “China’s ambition to become an AI superpower and the state of Chinese thinking on safety, cooperation, and AGI.” Ultimately, GovAI seeks to outline the key features of China’s AI strategy in order to understand one of the most important actors in AI governance. In March 2018, the program published Deciphering China’s AI Dream, a report that analyzes new features of China’s national AI strategy, and it plans to build upon this research in the near future.

Another workstream is Firm-Government Cooperation, which examines the role that private firms play in relation to the development of advanced AI and how these players are likely to interact with national governments. In a recent talk at EA Global San Francisco, Leung focused on how private industry is already playing a significant role in AI development and why, when considering how to govern AI, private players must be included in strategy considerations as a vital part of the equation. The description of the talk succinctly summarizes the key focal areas, noting that “private firms are the only prominent actors that have expressed ambitions to develop AGI, and lead at the cutting edge of advanced AI research. It is therefore critical to consider how these private firms should be involved in the future of AI governance.”

Other work that Leung highlighted includes modeling technology race dynamics and analyzing the distribution of AI talent and hardware globally.

The Road Ahead

When asked how much confidence she has that AI researchers will ultimately coalesce and be successful in their attempts to shape the landscape of long-term AI development internationally, Leung was cautious with her response, noting that far more hands are needed. “There is certainly a greater need for more researchers to be tackling these questions. As a research area as well as an area of policy action, long-term safe and robust AI governance remains a neglected mission,” she said.

Additionally, Leung noted that, at this juncture, although some concrete research is already underway, a lot of the work is focused on framing issues related to AI governance and, in so doing, revealing the various avenues in need of research. As a result, the team doesn’t yet have concrete recommendations for specific actions governing bodies should commit to, as further foundational analysis is needed. “We don’t have sufficiently robust and concrete policy recommendations for the near term as it stands, given the degrees of uncertainty around this problem,” she said.

However, both Leung and Dafoe are optimistic and assert that this information gap will likely close — and rapidly. Researchers across disciplines are increasingly becoming aware of the significance of this topic, and as more individuals begin researching and participating in this community, the various avenues of research will become more focused. “In two years, we’ll probably have a much more substantial research community. But today, we’re just figuring out what are the most important and tractable problems and how we can best recruit to work on those problems,” Dafoe told Wiblin.

The assurances that a more robust community will likely form soon are encouraging; however, questions remain regarding whether this community will come together with enough time to develop a solid governance framework. As Dafoe notes, we have never witnessed an intelligence explosion before, so we have no examples to look to for guidance when attempting to develop projections and timelines regarding when we will have advanced AI systems.

Ultimately, the lack of projections is precisely why we must significantly invest in AI strategy research in the immediate future. As Bostrom notes in Superintelligence: Paths, Dangers, Strategies, AI is not simply a disruptive technology; it is likely the most disruptive technology humanity will ever encounter: “[It] is quite possibly the most important and most daunting challenge humanity has ever faced. And — whether we succeed or fail — it is probably the last challenge we will ever face.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.


Publications

  1. Bostrom, N. Strategic Implications of Openness in AI Development. April 2016.
    • Covers a breadth of areas identified in the grant application for research, including long-term AI development, singleton versus multipolar scenarios, race dynamics, responsible AI development, identification of failure modes, and forecasting key intervention points.
  2. Ord, Toby. Lessons from the Development of the Atomic Bomb (draft). April 2016.
    • Looks at the technical, political, and strategic aspects of the development of the atomic bomb and compares these directly to considerations surrounding the development of AI. This paper covers core SAIRC research areas including information hazards, race dynamics, information security, geopolitical competition, resource mobilization, decisive strategic advantage, relevant policy and legal precedents for the control and regulation of AI and its development and proliferation, research ethics and the role of researchers in the eventual control of dangerous technologies, and other related areas.
    • It has been accepted for publication in the Bulletin of the Atomic Scientists.
  3. Armstrong, Stuart, et al. Safely Interruptible Agents. April 2016.
    • This paper was presented at UAI and attracted substantial media attention, with hundreds of articles covering it.


Workshops

  1. The Control Problem of AI: May 2016, Oxford.
    • This workshop covered goals and principles of AI policy and strategy, value alignment for advanced machine learning, the relative importance of AI versus other x-risk, geopolitical strategy, government involvement, analysis of the strategic landscape, theory and methods of communication and engagement, the prospects of international-space-station-like coordinated AGI development, and an enormous array of technical AI control topics.
  2. Policies for Responsible AI Development: Future of Humanity Institute, Oxford.
    • This workshop focused on classifying risks, international governance, and surveillance. The researchers engaged in a series of brainstorming and analysis exercises as a group.

Ongoing Projects

  1. Bostrom and Professor Allan Dafoe of Yale University are working on a paper outlining the desiderata for AI outcomes, and potential paths to achieving these.
  2. Andrew Snyder-Beattie has drafted an early version of a paper on the strategic significance of warning shots.
  3. Toby Ord has presented an early version of a paper looking at the dynamics of arms races in AI development. It suggests that there might be strong incentives to avoid ‘races to the precipice’ if all players are rational actors.
  4. Ben Livingston is co-authoring a paper with researchers at MIRI on “Decision Theory and Logical Uncertainty” for purposes of helping advance the technical research agenda on the control problem.
  5. Andrew Snyder-Beattie, Niel Bowerman, and Peter McIntyre engaged in a multi-week project of exploratory and agenda-setting research on technology forecasting, especially looking at the possible future of blockchain technologies and their strategic and security roles, including as they relate to AI strategy.
  6. Anders Sandberg drafted a detailed research agenda on the topic of information hazards.
  7. Carrick Flynn has been working with Allan Dafoe to set up a “Global Politics of AI” Research Group. It will look at issues of geopolitical strategy, the international and domestic politics of AI development, AI development scenarios, forecasting, policy levers for great power peace, and tracking international investments in AI technologies, among other areas.
  8. Cambridge is working to set up the Leverhulme Centre for the Future of Intelligence. This will greatly leverage the work of SAIRC over the coming decade, growing the community of academics and other partners bringing new expertise in governance, policy, technical machine learning and other disciplines to bear on the challenges of safe development of AI.