AI Safety Research

Seth Baum

Executive Director, Global Catastrophic Risk Institute

Project: Evaluation of Safe Development Pathways for Artificial Superintelligence

Amount Recommended: $100,000

Project Summary

Some experts believe that computers could eventually become a lot smarter than humans are. They call it artificial superintelligence, or ASI. If people build ASI, it could be either very good or very bad for humanity. However, ASI is not well understood, which makes it difficult for people to act to enable good ASI and avoid bad ASI. Our project studies the ways that people could build ASI in order to help people act in better ways. We will model the different steps that need to occur for people to build ASI. We will estimate how likely it is that these steps will occur, and when they might occur. We will also model the actions people can take, and we will calculate how much the actions will help. For example, governments may be able to require that ASI researchers build in safety measures. Our models will include both the government action and the ASI safety measures, to learn about how well it all works. This project is an important step towards making sure that humanity avoids bad ASI and, if it wishes, creates good ASI.

Technical Abstract

Artificial superintelligence (ASI) has been proposed to be a major transformative future technology, potentially resulting in either massive improvement in the human condition or existential catastrophe. However, the opportunities and risks remain poorly characterized and quantified. This reduces the effectiveness of efforts to steer ASI development towards beneficial outcomes and away from harmful outcomes. While deep uncertainty inevitably surrounds such a breakthrough future technology, significant progress can be made now using available information and methods. We propose to model the human process of developing ASI. ASI would ultimately be a human creation; modeling this process indicates the probability of various ASI outcomes and illuminates a range of ways to improve outcomes. We will characterize the development pathways that can result in beneficial or dangerous ASI outcomes. We will apply risk analysis and decision analysis methods to quantify opportunities and risks, and to evaluate opportunities to make ASI less risky and more beneficial. Specifically, we will use fault trees and influence diagrams to map out ASI development pathways and the influence that various actions have on these pathways. Our proposed project will produce the first-ever analysis of ASI development using rigorous risk and decision analysis methodology.
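As a rough illustration of the fault-tree approach mentioned above, the sketch below computes the probability of a top-level event from basic events joined by AND and OR gates, then compares a baseline against an intervention that lowers one basic-event probability. The event names, the tree structure, and every number are hypothetical placeholders chosen for illustration, not estimates from this project, and the calculation assumes the basic events are statistically independent.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Event:
    """A basic event (leaf) with an assumed probability of occurring."""
    name: str
    probability: float


@dataclass
class Gate:
    """An intermediate event whose occurrence depends on its children."""
    name: str
    kind: str  # "AND" or "OR"
    children: List[Union["Gate", Event]]


def probability(node: Union[Gate, Event]) -> float:
    """Probability that a node occurs, assuming independent basic events."""
    if isinstance(node, Event):
        return node.probability
    child_probs = [probability(child) for child in node.children]
    if node.kind == "AND":
        # All children must occur: multiply their probabilities.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one child occurs: one minus the chance that none occur.
        none_occur = 1.0
        for p in child_probs:
            none_occur *= 1.0 - p
        return 1.0 - none_occur
    raise ValueError(f"unknown gate kind: {node.kind!r}")


def build_tree(safety_failure: float) -> Gate:
    """Hypothetical two-level tree; all probabilities are placeholders."""
    asi_built = Gate(
        "ASI is built",
        "OR",
        [
            Event("development pathway A succeeds", 0.2),
            Event("development pathway B succeeds", 0.1),
        ],
    )
    return Gate(
        "harmful ASI outcome",
        "AND",
        [asi_built, Event("safety measures fail", safety_failure)],
    )


# Baseline versus a hypothetical policy that halves the chance that
# safety measures fail (for example, a requirement to build them in).
baseline = probability(build_tree(safety_failure=0.5))
with_policy = probability(build_tree(safety_failure=0.25))

print(f"baseline risk:    {baseline:.3f}")
print(f"risk with policy: {with_policy:.3f}")
```

The difference between the two results is one simple way to express how much an action helps. An influence diagram would extend this idea by adding explicit decision nodes and dependencies between events, which this sketch leaves out.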


Publications

  1. Pistono, F and Yampolskiy, RV. Unethical research: How to create a malevolent artificial intelligence. 25th International Joint Conference on Artificial Intelligence (IJCAI-16), Ethics for Artificial Intelligence Workshop (AI-Ethics-2016).
  2. Yampolskiy, RV. Taxonomy of pathways to dangerous AI. 30th AAAI Conference on Artificial Intelligence (AAAI-2016), 2nd International Workshop on AI, Ethics and Society (AI Ethics Society 2016).


Presentations and Meetings

  1. Seth Baum, Anthony Barrett, and Roman Yampolskiy presented their research at the 2015 Society for Risk Analysis Annual Meeting.
  2. At the International Joint Conference on Artificial Intelligence, Seth Baum organized several informal meetings on AI safety with attendees from CSER, FHI, MIRI, Yale, and the United Nations, among other places.

Ongoing Projects/Recent Progress

  1. Seth Baum, Anthony Barrett, and Roman Yampolskiy – Goal: to apply their model to analyze a debate between Ben Goertzel and Nick Bostrom, two longstanding superintelligence thought leaders. The researchers take the perspective of an observer who is uncertain whose argument is correct, and they model both the terms of the debate and the uncertainty about the arguments. This project advances their overall research agenda while contributing to an important current debate in the superintelligence research community.
  2. Seth Baum, Anthony Barrett, and Roman Yampolskiy – Goal: to expand their core model and the surrounding discussion. The document containing this work is currently approximately 100 pages long. The researchers plan to expand and refine it over the duration of the project for eventual publication as a book. The book would be the first of its kind on superintelligence risk analysis, laying out their vision for the subject and their foundational research on it.