AI Safety Research
- Bostrom, N. Strategic Implications of Openness in AI Development. April 2016.
- Covers a breadth of the research areas identified in the grant application, including long-term AI development, singleton versus multipolar scenarios, race dynamics, responsible AI development, identification of failure modes, and forecasting key intervention points.
- Ord, Toby. Lessons from the Development of the Atomic Bomb (draft). April 2016.
- Looks at the technical, political, and strategic aspects of the development of the atomic bomb and compares these directly to considerations surrounding the development of AI. This paper covers core SAIRC research areas including information hazards, race dynamics, information security, geopolitical competition, resource mobilization, decisive strategic advantage, policy and legal precedents for the control and regulation of AI and its development and proliferation, research ethics and the role of researchers in the eventual control of dangerous technologies, and other related areas.
- It has been accepted for publication in the Bulletin of the Atomic Scientists.
- Armstrong, Stuart, et al. Safely Interruptible Agents. April 2016.
- This paper was presented at UAI and attracted substantial media attention, generating hundreds of articles.
- The Control Problem of AI: May 2016, Oxford.
- This workshop covered goals and principles of AI policy and strategy, value alignment for advanced machine learning, the relative importance of AI versus other existential risks, geopolitical strategy, government involvement, analysis of the strategic landscape, theory and methods of communication and engagement, the prospects of International-Space-Station-like coordinated AGI development, and a wide array of technical AI control topics.
- Policies for Responsible AI Development: Future of Humanity Institute, Oxford.
- This workshop focused on classifying risks, international governance, and surveillance. The researchers engaged in a series of brainstorming and analysis exercises as a group.
- Bostrom and Professor Allan Dafoe of Yale University are working on a paper outlining desiderata for AI outcomes and potential paths to achieving them.
- Andrew Snyder-Beattie has drafted an early version of a paper on the strategic importance and significance of warning shots.
- Toby Ord has presented an early version of a paper looking at the dynamics of arms races in AI development. It suggests that there might be strong incentives to avoid ‘races to the precipice’ if all players are rational actors.
- Ben Livingston is co-authoring a paper with researchers at MIRI on “Decision Theory and Logical Uncertainty” to help advance the technical research agenda on the control problem.
- Andrew Snyder-Beattie, Niel Bowerman, and Peter McIntyre engaged in a multi-week project of exploratory and agenda-setting research on technology forecasting, especially the possible future of blockchain technologies and their strategic and security roles, including their relevance to AI strategy.
- Anders Sandberg drafted a detailed research agenda on the topic of information hazards.
- Carrick Flynn has been working with Allan Dafoe to set up a “Global Politics of AI” Research Group. It will look at issues of geopolitical strategy, the international and domestic politics of AI development, AI development scenarios, forecasting, policy levers for great-power peace, and tracking international investments in AI technologies, among other areas.
- Cambridge is working to set up the Leverhulme Centre for the Future of Intelligence. This will greatly extend the work of SAIRC over the coming decade, growing the community of academics and other partners and bringing new expertise in governance, policy, technical machine learning, and other disciplines to bear on the challenges of safe AI development.